AUTOMATIC CHARACTER RECOGNITION:
A  STATE-OF-THE-ART  REPORT 

M.    E.   Stevens 

ABSTRACT 

A  state-of-the-art  report  on  current  progress  in  automatic  character 
recognition is presented. Areas of applicability and possibilities for controlled
solutions to automatic character reading problems are discussed.
Some  commonly  used  methods  for  character  recognition,   the  steps  involved 
in  a  generalized  recognition  process,   and  comparative  characteristics  of 
certain  representative  character  recognition  systems  are  considered. 
Prospects  for  further  progress,   including  potentially  related  research  in 
pattern  recognition,    are  reported. 

1.    INTRODUCTION 

Current progress in research looking toward large-scale information selection and retrieval
systems, or toward mechanized translation, has been accompanied by increased attention to problems
of input, file preparation, and file maintenance. These two fields are important areas for the potential
application of automatic information processing techniques. In both there is an increasingly
well-recognized need to prepare large files or a large volume of input messages in machine-usable
language. In addition, the availability of natural language textual material in machine-usable form
is becoming increasingly important for further progress in research in linguistics, which should
pace further promising developments in both information selection and mechanized translation.

This  report  of  the  Research  Information  Center  and  Advisory  Service  on  Information 
Processing  is  one  of  a  series  intended  as  contributions  to  the  improvement  of  cooperation  in  the 
fields  of  information  selection  systems,   information  retrieval  research,   and  mechanized  translation. 
In  these  fields,   as  in  others,    the  use  of  automatic  data  processing  equipment  has  both  posed  new 
problems  and  offered  new  solutions  to  the  handling  of  large  masses  of  information.     From  a  systems 
standpoint,   a  particularly  critical  problem  is  the  need  to  copy,    to  transcribe  and  in  some  cases  to 
transliterate  large  volumes  of  data  for  further  processing  by  machine.     The  background  of  the 
present  study,    certain  general  observations,   and  a  brief  discussion  of  the  presentation  to  be  followed 
are  given  first. 

1.1  Background

The  development  of  automatic  techniques  for  transcribing  data  from  typed  or  printed  form  to  a 
form  directly  usable  by  machines  holds  considerable  promise  for  eventual  application  in  a  wide 
variety  of  data  processing  operations.   Such  automatic  techniques  consist,    in  general,   of  devices  or 
machine  systems  that  are  capable  of  performing  the  following  operations: 

(1)  Feeding  paper  or  other  carrier  material  that  bears  typed  or  printed  alphanumeric 
information  to  a  sensing  station; 

(2)  Controlling  the  position  of  each  line  and  each  symbol  for  scanning; 

(3)  Scanning  and  processing  the  two-dimensional  visual  image  presented; 

(4)  Comparing  an  input  image  as  a  pattern  against  a  set  of  reference  patterns  representing 
the  vocabulary  of  symbols  to  be  recognized; 

(5)  Identifying  the  input  pattern  as  the  product  of  the  scanning-recognition  process; 

(6)  Providing  an  output  symbol  representing  the  particular  input  symbol  that  has  been 
recognized. 

Such  techniques  promise  major  advantages  in  a  wide  variety  of  data  processing  and  data  handling 
situations,    specifically  including  those  of  mechanized  information  selection  systems  and  mechanized 
translation.     These  potential  advantages  include  reductions  in  manpower  requirements  as  well  as 
reductions  of  errors  and  inaccuracies  found  in  manual  data  transcription  operations. 
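
Operations (3) through (6) above can be illustrated by a minimal sketch, assuming a simple cell-by-cell comparison of a scanned binary image against stored reference patterns. The 3x3 patterns, the scoring rule, and the names `REFERENCE_PATTERNS`, `match_score`, and `recognize` are hypothetical and are chosen only for illustration; they do not describe the logic of any particular device surveyed in this report.

```python
from typing import Dict, List

Image = List[List[int]]

# Hypothetical 3x3 binary reference patterns (1 = ink, 0 = background).
# A real reader would use a far finer scanning grid and a larger vocabulary.
REFERENCE_PATTERNS: Dict[str, Image] = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
    "T": [[1, 1, 1],
          [0, 1, 0],
          [0, 1, 0]],
}


def match_score(image: Image, pattern: Image) -> int:
    """Count the cells on which the scanned image and a reference pattern agree."""
    return sum(
        1
        for image_row, pattern_row in zip(image, pattern)
        for image_cell, pattern_cell in zip(image_row, pattern_row)
        if image_cell == pattern_cell
    )


def recognize(image: Image) -> str:
    """Return the output symbol whose reference pattern agrees with the image on the most cells."""
    return max(
        REFERENCE_PATTERNS,
        key=lambda symbol: match_score(image, REFERENCE_PATTERNS[symbol]),
    )


# A "T" with one smudged cell in the lower right corner.
scanned = [[1, 1, 1],
           [0, 1, 0],
           [0, 1, 1]]
print(recognize(scanned))
```

The comparison step corresponds to operation (4), the selection of the best match to operation (5), and the returned symbol to operation (6).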

This  report  incorporates  the  results  of  a  fact-finding  survey  of  automatic  character  reading 
techniques  which  was  conducted  by  the  National  Bureau  of  Standards  for  the  Rome  Air  Development 
Center  in  the  period  1956-1957.     The  earlier  survey  has  been  extended  and  supplemented  by  study 
of  the  available  literature,    by  continuing  inspections  of  character  reader  devices,   and  by  periodic 
discussions with research and development personnel interested in automatic character recognition
techniques. The purpose of this report is therefore to review current developments in the field of
character  recognition,   with  emphasis  upon  actual  devices  designed  to  identify  printed  or  typed 
information.

"That the blind may read," 1/ is the earliest recorded objective for research attempts to develop
devices capable of reproducing and transcribing printed, typed, or handwritten characters. From the
Optophone of Fournier d'Albe as demonstrated in 1912 2/ to the latest contractor's progress reports
to the Prosthetic and Sensory Aids Service of the Veterans Administration, 3/ this objective has been
approached in two distinct ways. The first avenue is that of reproduction, such as pantographic
copying in a different size or at a distance. This direct translation approach includes many different
means for transducing the black and white character to some other form. Examples include the
conversion of a distinctive two-dimensional visual pattern to a distinctive pattern of sounds 4/ and the
projection of the character pattern to a pattern of raised pins on the head-surface of a hand-held
probe. 5/ We note that: "Such reading machines have no 'brains' whatever built into them. They
require on the part of the user the mastery of a code which bears no relation to anything he may have
learned before." 6/

The second avenue, conversely, is that of the development of techniques capable of reading and
recognition in the sense of identifying a given character image as a specific one of a number of
images in a reference vocabulary. That is, the character is 'recognized' and not merely copied or
transformed. We thus define 'recognition' as a process requiring a decision as to which specific one
of a plurality of possible input patterns was in fact 'sensed' by a suitable scanning-detection
mechanism. Such a decision is used to select a particular one of a plurality of possible output
patterns, whether it is a phonetic sound associated with an alphabetic character or the appropriate
code-sequence for direct computer processing. Ultimately we may hope for the machine-recognizer
to do more: in effect, to override as necessary the formal implications of a received input pattern.
There is, for example, no acceptable word in English with the character sequence "a, c, o, u, i, . . .",
so that "q" should be substituted for the "o" by applying context-dependency rules. It is this second
approach, involving recognition decision, that we shall refer to by such terms as "automatic reading
techniques" and "automatic character recognition".
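
The context-dependency rule mentioned above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the small word list `LEXICON`, the confusion table `CONFUSIONS`, and the function `correct_by_context` are hypothetical, and a practical system would need a full dictionary and confusion data measured on the reader in question.

```python
from itertools import product

# Hypothetical word list and table of easily confused characters (assumptions).
LEXICON = {"acquire", "acquit", "quote", "occur"}
CONFUSIONS = {"o": ["q"], "q": ["o"]}


def correct_by_context(word: str) -> str:
    """Return the word unchanged if it is in the lexicon. Otherwise try replacing
    easily confused characters, and accept a variant only if exactly one variant
    is a lexicon entry; if none or several qualify, leave the word as read."""
    if word in LEXICON:
        return word
    # For each position, the character as read plus any characters it may be confused with.
    options = [[ch] + CONFUSIONS.get(ch, []) for ch in word]
    candidates = {"".join(chars) for chars in product(*options)} & LEXICON
    return candidates.pop() if len(candidates) == 1 else word


print(correct_by_context("acouire"))
```

Requiring a unique lexicon match is a deliberately cautious design choice: when the context does not settle the question, the sketch leaves the recognized sequence alone rather than guessing.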

Various  distinctions  have  been  made  in  the  literature  between  'character  recognition'  and 
'pattern recognition'. For example, it has been noted that: "'Pattern' . . . means much more than
'character pattern' and generically connotes an arrangement or inter-orientation with no necessary
semantic value". 7/ On the other hand, it has been observed that: "The distinction between a
'character' and a 'pattern' is important because both a Gothic and an Arabic A may be considered to
be the same 'character', but are entirely different patterns". 8/ In this report we shall be concerned
with  the  recognition  of  character  patterns,    that  is,    with  appropriate  subsets  of  the  set  of  two- 
dimensional  visual  patterns  which  display  substantial  color-contrast  between  figure  and  ground  and 
which  are  equated  for  various  practical  purposes  with  the  letters  of  the  alphabet,    the  digits  of  a 
numbering  system,   and  punctuation  marks.     We  shall  also  consider  certain  aspects  of  the  more 
general  field  of  pattern  detection,    pattern  recognition,   and  pattern  formulation  from  the  point  of  view 
of  indicating  fruitful  areas  for  further  progress. 


1/ Kerwood uses exactly this wording as title for a brief survey of reading aids, "... That the
blind may read," Ref. 253. (Note: The notation "Ref. xxx" refers to the correspondingly
numbered item in the bibliography, pp. 137-168.)

2/ Fournier d'Albe, E. E. "A reading optophone", Ref. 145. See also Refs. 2, 144, 146, 149,
386, 549.

3/ Freiberger and Murphy (Ref. 149) provide a recent review as of March 1961. Representative
progress reports include Refs. 1, 3, 4, 120, 121, 300.

4/ See, for example, Refs. 38, 149, 507, 508, 509, 541, 544, 545, 547, 548, 549.

5/ That is, a "tactile replica" device. See Surber, "Reading and writing device for the blind",
Ref. 465. See also Refs. 149, 412, 434, 487, 507 (pp. 9-10), 508 (p. 6).

6/ Ritter, C. G. "Technical research and blindness", Ref. 386, p. 17.

7/ Young, D. A. "Automatic character recognition", Ref. 539, p. 2.

8/ Grimsdale, R. L., F. H. Sumner, C. J. Tunis and T. Kilburn. "A system for the automatic
recognition of patterns", Ref. 185, p. 210.


1.2  General Observations

The  present  state  of  the  art  in  automatic  character  recognition,   as  we  have  defined  it,    ranges 
from  magnetic  ink  character  recognition  devices  to  continuing  activity  in  research  on  character  and 
pattern  recognition  techniques.     It  covers  both  optical  readers  now  in  actual  field  use  for  applications 
involving  limited  character  vocabularies  or  specialized  fonts,   and  operational  prototypes  of  devices 
capable  of  reading  a  typed  or  printed  page.     In  the   1957  survey,   the  following  evaluation  of  the  then 
state of the art was made:

"In summary, then, the current state of the art of automatic character recognition
provides reasonably good prospects for reading limited vocabularies of good quality
material where carrier-item handling and positioning difficulties can be closely controlled.
For large-vocabulary multiple-type carrier-item applications, considerable development
effort will still be required. This is especially true in the areas of full-page text reading,
in the reading of average quality typewritten material at normal spacing, and in the reading
of foreign language materials. The possibilities of developing a standardized type-face for
general use in both printed and typed material should materially reduce the logical
requirements in automatic readers as well as produce other benefits to management. Paper
handling and positioning problems will require extensive engineering for improvements
necessary to deal with a variety of materials in varied formats from a variety of sources
over which there is no administrative control." 1/

In a later survey of the field, Minot 2/ made the following evaluation:

"Commercially  profitable  devices  for  pattern  recognition,   as  of  1  April  1959,    include 
devices  for  (1)  reading  capital  letters  and  numbers  of  a  selected  type  face  or  faces,   and 
(2)  counting  objects  of  certain  sizes  and  shapes.     The  extent  of  research,    development, 
and  interest  in  this  field  and  allied  fields  suggests  strongly  that  within  two  to  five  years 
much  more  complicated  tasks  of  'visual'  recognition  will  be  performed  automatically.  " 

The American Bankers Association recommended adoption of magnetic ink for automatic
character recognition in 1956. 3/ There is now widespread commitment to and use of this "MICR"
system in various banking operations. The Farrington Company, through its subsidiary, the
former Intelligent Machines Research Company, can claim the pioneering application of optical
reading techniques in actual production situations, with an installation at Reader's Digest in 1955. 4/
Farrington-IMR today has more than 40 machines installed for various customers, including the
Rome Air Development Center, 5/ which has a page-reader in operation. A typed-page reader for
double-space, upper case, 12-point standard elite type has also been demonstrated for the U.S.
Army Signal Research and Development Laboratory by Control Instruments Company, a subsidiary
of the Burroughs Corporation. 6/ Other organizations currently engaged in development of full-page
character readers, some of which are designed for Cyrillic alphabets, include Baird-Atomic,
Philco, RCA, and Rabinow Engineering. In a survey for the journal Management and Business
Automation, for September 1960, Keller, after citing Farrington-IMR automatic reader applications,
made the following statements:
2/ Minot, O. N. "Automatic devices for recognition of two-dimensional visual patterns: a survey
of the field", Ref. 310, p. 23.

3/ The report, "Magnetic ink character recognition: the common language for check handling",
prepared by the Technical Subcommittee on Mechanization of Check Handling of the Bank
Management Commission, American Bankers Association, was published in the journal
Banking in August 1956. See Ref. 289, also Refs. 8, 9, 10, 11, 12, 13, 14.

4/ Keller, A. E. "Optical scanning - an unlimited horizon", Ref. 249, p. 25.

5/ "Electronic machine reads typewritten material", Ref. 115; see also Ref. 333.

6/ Deckert, W. W. "The recognition of typed characters", Ref. 87.


"IBM,    GE,    RCA,   and  NCR  are  among  those  expected  to  announce  their  entry  into  the 
optical  scanning  field  at  any  moment.     The  National  Data  Processing  Company,    Dallas, 
has  announced  plans  to  install  the  first  optical  scanning  system  to  handle  retail  accounts 
receivable." 1/

The IBM 1418 Optical Character Reader has already been announced, with deliveries promised for
early 1962. 2/ The equipment will be capable of reading IBM 407 type font, 10 characters to the inch,
or the 407-E font, 7 characters to the inch, as imprinted by plastic charge plates. 3/

Many  other  companies,   both  in  the  United  States  and  abroad,   have  developed  either  magnetic  ink 
recognition  devices,    or  optical  reading  equipment,     or  both.     Demonstrations  of  additional  full-page 
readers  for  printed  and  typed  material  are  scheduled  for  the  early  months  of  1961.     In  most  of  these 
cases, there are limited, specially designed vocabularies of characters to be recognized. Engineering
considerations have generally been more significant in these development programs than questions
of research in recognition logic.

In  the  United  States,    a  list  of  companies  and  organizations  interested  in  the  field  of  either 
magnetic  or  optical  scanning  would  be  out-dated  almost  as  soon  as  such  a  list  could  be  alphabetized. 
Most of the major U.S. concerns engaged in data processing, manufacture of office equipment, and
development of communication systems are actively interested in possibilities for automatic
character recognition. For example, among the organizations not yet committed to a specific product
development line, the Remington Rand Division of the Sperry Rand Corporation reports that it is
actively investigating the developmental and marketing possibilities for optical as well as magnetic ink
scanning and recognition. New entries in the field are to be noted almost daily. Thus, in the March
1961 issue of Datamation, 4/ an advertisement reveals that a long-range program in pattern and
character recognition has been initiated at the General Motors Research Laboratories, Warren,
Michigan.

Elsewhere than in the United States, optical character recognition techniques are being investigated,
for example, in a number of institutions in the Soviet Union; at IBM Deutschland, at the
Technische Hochschule, Karlsruhe, and at Standard Elektrik Lorenz, in West Germany; and at
universities, at the National Physical Laboratories, and by commercial organizations in the United
Kingdom. Perotto of Olivetti, Milan, and Gamba, of the University of Genoa, represent interests in
the field in Italy. 5/ Research and development activities in automatic reading techniques are in
progress at the Electrotechnical Institute, Japan, and a prototype reading machine developed by
Sakai of the University of Kyoto was demonstrated in February of 1961. 6/

In the three and a half years that have elapsed since our previous report on this subject, therefore,
we note that there has indeed been progress. However, there has been rather less progress,
with  certain  notable  exceptions,    than  might  have  been  expected  considering  that  the  domain  of 
potential  applicability  is  so  widespread.     Within  this  wide  area,    certain  problems  can  perhaps  be 
solved  no  other  way.     Thus  the  state  of  the  art  remains  characterized  by  both  paradox  and  promise. 
There  are  paradoxical  aspects  because  that  which  can  be  regarded  as  "everybody's  problem"  tends  to 
remain  "nobody's  problem".     Minot  in  his   1959  study  7/  specifically  states  that  such  advances  as 
recognition  of  many  styles  of  printed  characters,   handwriting,    landmarks,    silhouettes,    reading  of 
diagrams,   and  the  like,    must   await  "mostly  the  necessary  economic  push". 

There  also  appears,    paradoxically,    to  be  some  tendency  to  wait  for  solutions  to  the  most 
sophisticated  problems  in  the  field  rather  than  to  test  the  already  available  aids  to  mechanized 
information  processing.     The  feasibility  of  applying  automatic  reading  techniques  for  good  quality 
1/ Keller, A. E. "Optical scanning ...", Ref. 249, p. 24. See also Ref. 90, p. 39.

2/ See Refs. 222, 223, 224, 234, 349.

3/ "IBM general information manual. 1401 Data Processing System. 1418 Optical Character
Reader," Ref. 232.

4/ Datamation, 7:3 (March 1961) p. 59.

5/ See Ref. 154.

6/ See Ref. 332.

7/ Minot, O. N. "Automatic devices for recognition of visible two-dimensional patterns: a survey
of the field." Ref. 310.


printed, typed, or tabulator-listed copy does not necessarily depend upon the achievement of high
accuracy in letter-by-letter recognition of cursive handwriting. On the other hand, handwritten
numerals have in fact already been recognized with accuracies of 80% or better.

There  also  seems  to  be  some  insistence  on  gaining  a  number  of  major  new  advantages 
concurrently.     That  is,    some  proposed  performance  specifications  call  for  markedly  increasing  the 
speed  of  transcription  while  reducing  costs  and  also  simultaneously  improving  upon  present 
standards of reliability. A suggested performance standard for an automatic character reader, for
example, 1/ includes the following statements:

"A fundamental requirement is reliability. All characters within its scope should be read
correctly in spite of bad printing, dirt and smudges, impurities in paper, misalignment, and
other unavoidable imperfections ... on no account should it misread one character for
another."

Such  requirements  are  both  impractical  and  unrealistic  in  the  face  of  the  fact  that  manual  errors  are 
now lived with, even in carefully controlled accounting operations. As many as 2 in 10,000 keypunched
characters may be erroneous, even after verification. Without verification, the keypunching
error  rate  may  be  as  high  as   1  in  500  characters.     In  addition,    as  much  as  4%  or  more  of  the  original 
data  may  be  in  error  by  virtue  of  human  mistakes  in  copying,    say,    a  stock  or  catalog  number. 

We  note  that  there  is  promise,   however,   both  in  the  area  of  development  of  automatic  character 
recognition  devices  and  in  the  area  of  potentially  related  research.     The  use  of  optical  character 
recognition techniques for special-purpose alphabets has been independently adjudged economic in at
least one nationwide industry in the United States. 2/ Several different page-readers for a reasonable
variety of Cyrillic typeface styles are approaching the stage of demonstration. A proposed reader for
microfilmed pages of Russian text, intended for input to a mechanized translation program, was
described by Baird-Atomic representatives at hearings of the Committee on Space and Astronautics,
U.S. House of Representatives, in the Spring of 1960. 3/ At least one organization engaged in
development of automatic reading techniques, the Rabinow Engineering Company, is considering a
character recognition system for a 10,000-character Chinese font.

Tests  recently  conducted  by  the  Bureau  of  Supplies  and  Accounts,    Department  of  the  Navy,   for 
typewritten  material  prepared  in  the  field  show  surprisingly  good  results  considering  the  lack  of 
control of the quality of input. Figure 1 shows samples of the material, prepared on typewriters with
a slightly modified typeface, which was successfully read by a Farrington-IMR reader (Figure 1a) or
rejected by the machine as too light (Figure 1b), or too dark (Figure 1c). Other samples were
rejected as being improperly prepared (Figure 1d). It should be noted that those who were to type the
material for the tests were specifically instructed not to make strikeovers. Even on preliminary
tests, after sorting out improperly prepared documents, 80% or more of the samples were read.
Under these conditions, the preliminary observed error rate of 3.03% in terms of documents, but of only
0.33% in terms of characters, appears promising. 4/

We  note  also  that  the  "too  light"  category  of  Figure   1  is  to  be  interpreted  in  terms  of  the 
conditions  of  the  test.     Specifically,   these  included  card  characteristics  such  as  the  use  of  format  and 
field  boundary  lines  and  preprinted  fixed  information.     Although  the  carrier  background  was 
deliberately  imprinted  in  light  blue  ink,   the  typed  impression  in  some  cases  gave  approximately  the 
same  intensity-contrast  values  as  did  that  background.     The  same  characters  on  clear  white  bond 
paper  would  often  have  been  readable  by  the  same  machine. 

An  important  distinction  for  purposes  of  evaluating  current  progress  and  promise  in  the  field  is 
to  be  made  between  "rejects"  and  "errors".     The  term  "reject"  refers  to  nonrecognition,   or  failure 
to identify, while "error" refers to a misrecognition, or false identification, such as a mistaking of an
"A" for an "R". 5/ If, in the operational requirements of a particular proposed application of
1/ Broido, D. "Recent work on reading machines for data processing," Ref. 61.

4/ Data through the courtesy of B. Radack, Systems Research Division, Bureau of Supplies and
Accounts, Department of the Navy.

5/ Compare, for example, the definition of Marill and Green: "Each physical sample will be
considered as 'actually belonging to' one of . . . n categories. Whenever a pattern recognizer
assigns to category j a physical sample actually belonging to a category other than j, we say
that the pattern recognizer has made an error." (Ref. 294, p. 472.)

Figure 1. Samples of Typed Material for Test of
Automatic Character Recognition


automatic reading techniques, the minimization of true errors is a critical factor, the system can
often be designed so that the threshold for rejects is low enough to avoid all but a very small percentage
of errors, say, 1 in 100,000, but at the price of manual processing of reject material. It is, as
we have noted, one of the paradoxes of the difference between promise and performance in practical
use of character recognition systems to date that potential customers insist on minimum-error-rate
specifications for a machine of 1 in 100,000, or even more stringent requirements, whereas 1 or
more errors in 1,000 characters punched, or 1 in 500 characters typed, is not unusual for manual
operations.
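
The trade-off between rejects and errors can be illustrated by a minimal sketch of a recognition decision with a reject option. The match scores, the two thresholds `min_score` and `min_margin`, and the function `decide` are assumptions introduced here for illustration only; actual readers implement the reject decision in their own circuitry and logic.

```python
from typing import Dict, Optional


def decide(scores: Dict[str, float], min_score: float, min_margin: float) -> Optional[str]:
    """Return the best-matching symbol, or None (a reject) when the best match
    is too weak or is not clearly separated from the runner-up.

    `scores` maps each candidate symbol to a match score (higher is better)
    and is assumed to contain at least two candidates.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_symbol, best_score), (_, runner_up_score) = ranked[0], ranked[1]
    if best_score < min_score or best_score - runner_up_score < min_margin:
        return None  # reject: pass the item to manual processing
    return best_symbol


# A clear "A" is accepted; an image that could be "A" or "R" is rejected
# rather than risking a false identification.
print(decide({"A": 0.95, "R": 0.60}, min_score=0.8, min_margin=0.2))
print(decide({"A": 0.82, "R": 0.78}, min_score=0.8, min_margin=0.2))
```

Making the acceptance conditions stricter turns doubtful cases into rejects instead of possible errors, which is the trade described above: fewer true errors at the price of more material processed by hand.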

Let us suppose, however, that we are concerned with the transcription of natural language texts
into machine-usable form. There is to be 100% verification (complete duplicate re-punching, with
locked controls for any difference with prior-punched material). Under such conditions, organizations
engaged in the machine processing of natural language material report an average cost for input
keypunching with verification ranging between 1.5 and 5 cents per word. 1/ Let us further suppose,
conservatively, that the equivalent cost for machine character recognition systems is not more than
a tenth of a cent per word. 2/ Then, taking a manual rate of 2.5 cents per word, we would still have a
comparative cost of $2,500 per 100,000 words for the manual system vs. less than half this cost with
even a 40% reject rate. 3/
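
The arithmetic of this comparison can be checked with a short sketch. The rates are those quoted above (2.5 cents per word manual, a tenth of a cent per word by machine, 40% rejects re-keyed manually); the function names and the break-even formula are additions made here for illustration and assume that every rejected word is simply re-keyed at the manual rate.

```python
def manual_cost(words: int, manual_rate: float) -> float:
    """Cost in dollars of keypunching and verifying every word manually."""
    return words * manual_rate


def machine_cost(words: int, machine_rate: float, manual_rate: float,
                 reject_fraction: float) -> float:
    """Cost in dollars of machine reading every word, plus manual
    keypunching of the rejected fraction."""
    return words * machine_rate + words * reject_fraction * manual_rate


def break_even_reject_fraction(machine_rate: float, manual_rate: float) -> float:
    """Reject fraction at which the machine system costs the same as the manual one."""
    return 1.0 - machine_rate / manual_rate


words = 100_000
manual = manual_cost(words, manual_rate=0.025)
machine = machine_cost(words, machine_rate=0.001, manual_rate=0.025, reject_fraction=0.40)
print(f"manual:  ${manual:,.0f}")
print(f"machine: ${machine:,.0f}")
print(f"break-even reject fraction: {break_even_reject_fraction(0.001, 0.025):.2f}")
```

Under these assumptions the machine system costs about $1,100 per 100,000 words ($100 for reading plus $1,000 for the rejects) against $2,500 for the manual system, and would remain cheaper until nearly all of the material was rejected.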

An  important  indication  of  future  promise,    therefore,    lies  in  the  realistic  appraisal  of  acceptable 
reject  rates,   adjusted  so  as  to  minimize  the  possibility  of  true  errors.     We  shall  discuss  the  question 
of  the  relationship  between  economic  break-even  points  and  possible  reject  rates  in  more  detail  in  a 
later  section  of  this  report. 

Finally,    in  the  area  of  promise,   we  note  that  research  efforts  are  moving  forward  in  potentially 
related  techniques  for  the  large  and  open-ended  vocabularies  of  possible  character  configurations 
needed for the future. For example, in related research areas, computer simulations of several
different methods for recognition of handprinted or handwritten material have been demonstrated. 4/
It has been reported that Solartron will produce a prototype machine for reading handwritten
numerals in 1961; 5/ a device for this purpose is being demonstrated at Rabinow Engineering
laboratories, and an experimental constrained handwritten character recognizer was displayed by
IBM at the 1961 Western Joint Computer Conference. Special devices and computer simulation
techniques have also been used in pattern recognition experiments to identify abstract geometric
shapes. 6/ Thus we find a spectrum ranging from special-purpose pattern detection devices, where
the  "patterns"  allowed  are  limited  to  the  specially  designed,   highly  stylized,    quality  controlled 
characters  in  a  very  limited  vocabulary  (whether  magnetically  or  optically  sensed)  to  pattern 
recognition  techniques  which  seek  to  equate  any  character  that  is  recognizable  by  human  beings  as 
"A"  with  any  of  the  other  possible  printed,   typed,    block-printed  or  handwritten  versions  of  "A". 

From the standpoint of current development of operational devices and the application of automatic
readers in productive data processing situations, however, the following major factors affect
the  feasibility  or  the  cost  of  automatic  character  recognition,    in  a  descending  order  of  importance: 

(1)  Quality  of  input; 

(2)  Size  and  nature  of  the  vocabulary  of  characters  to  be  recognized; 

(3)  Carrier  handling  requirements; 

(4)  Reliability  requirements; 

(5)  Flexibility  in  making  adjustments  to  meet  changing  requirements. 

Of  far  less  significance,   as  we  see  the  present  progress  in  this  field,   are  such  questions  as: 
Particular scanning techniques--for example, whether optical or electronic; particular recognition
1/ Data presented at the Third Institute on Information Storage and Retrieval, The American
University, Washington, D. C., February 1961.

2/ Or less, and we should note that this consideration includes equipment amortization, which is
not possible for that part of manual operations involving personnel time and costs.

3/ That is, 100,000 words at $0.001 per word; 40,000 words at a manual rate (for rejects) at
$0.025 per word.

4/ See for example Bledsoe, W. W. and I. Browning. "Pattern recognition and reading by
machine", Ref. 51; Doyle, W. "Recognition of sloppy, hand-printed characters," Ref. 104; and
Estavan, D. "Pattern recognition, machine learning, and automated teaching," Ref. 125.

5/ Young, D. A. "Automatic character recognition", Ref. 539, p. 3.

6/ See for example Clark, W. A. and B. G. Farley. "Generalization of pattern recognition in a
self-organizing system," Ref. 76; Harmon, L. D. "A line-drawing pattern recognizer," Ref. 194;
Hodes, L. "Recognition of outline figures," Ref. 216; Uhr, L. and Vossler, C. "A pattern
recognition program," Ref. 497; and Unger, S. H. "Pattern detection and recognition," Ref. 501.


logics--however simple or however sophisticated; rate of recognition; or reject rates. With respect
to  recognition  rate,    or  reading  speeds,    for  example,    we  find  a  range  of  several  seconds  per 
character  for  computer  analysis  and  decision  in  the  case  of  handwritten  characters,    several  hundred 
characters  per  second  for  limited  vocabulary  character  readers  already  in  field  use,   and  one  or  more 
thousand  characters  per  second  in  readers  designed  for  alphanumeric  page  reading.     We  note 
especially  that  the  speed  at  which  characters  can  be  identified  may  exceed  the  speed  at  which  blocks 
of  such  characters  can  be  moved  into  position  for  scanning  or  the  speed  at  which  the  desired  output 
patterns (such as holes punched into cards or paper tape) can be produced. 1/ For machines with
reading  rates  of  better  than  1,000  characters  per  second,    direct  output  to  magnetic  tape  is  almost 
mandatory  if  the  character  recognition  system  is  not  to  'hurry  up  and  wait'  during  a  large  part  of  the 
reading  cycle. 
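
The 'hurry up and wait' effect can be illustrated with a little arithmetic. The following is a minimal sketch, assuming a reader whose three stages (document transport, recognition, and output recording) operate in series so that the slowest stage sets the sustained rate. The function names and the sample rates in the note below are hypothetical illustrations, not figures from any system surveyed here.

```python
def sustained_rate(transport_cps, recognition_cps, output_cps):
    """Sustained characters per second for a reader whose three stages
    (paper transport, recognition logic, output recording) run in series.
    Assumption: the slowest stage sets the pace for the whole system."""
    return min(transport_cps, recognition_cps, output_cps)


def waiting_fraction(transport_cps, recognition_cps, output_cps):
    """Fraction of the reading cycle during which the recognition logic
    is idle because transport or output cannot keep up."""
    rate = sustained_rate(transport_cps, recognition_cps, output_cps)
    return 1.0 - rate / recognition_cps
```

With assumed rates of 2,000 characters per second for transport, 1,000 for recognition, and 200 for a punched output device, `sustained_rate(2000, 1000, 200)` gives 200, and the recognition logic would wait for about 80 percent of the cycle. Raising the output rate to match the recognition rate removes the wait, which is the argument for direct output to magnetic tape.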

Closely similar conclusions with respect to the predominance of the factor of quality of input have
been stated by the founder of Intelligent Machines Research (now Farrington Electronics), as follows:

"However,   by  far  the  most  significant  aspect  of  most  practical  character  sensing 
applications  is  the  necessity  to  cope  with  imperfections  in  printing,   imprinting,    writing 
of  the  characters  themselves,    which  imperfections  can  confound  any  rule,   and  in  the 
extreme will confound the human also. When printing is very good very little improvement
can perform wonders. When very poor, the most sophisticated logic runs into
trouble. Most practical applications lie in between." 2/

1.3    Presentation 

For  these  reasons,   then,   the  present  survey  of  current  progress  in  the  field  of  automatic 
character recognition will be concerned with the general areas of applicability of character recognition
devices, with operational requirements in representative applications, and with possibilities
for  controlled  solutions  to  character  recognition  problems,   as  well  as  with  descriptions  of  selected 
techniques  and  systems  and  discussion  of  the  prospects  for  further  progress. 

The  presentation  of  this  report  on  current  progress  in  automatic  character  recognition  is 
intended  to  accommodate  various  depths  of  interest,    in  either  scope  or  coverage  of  technical  detail, 
on  the  part  of  readers  having  different  outlooks  with  respect  either  to  areas  of  applicability  or  to 
questions  of  instrumentation.     We  have  therefore:     (1)    considered  information  retrieval  or 
mechanized  translation  applications  as  only  a  special  case  of  general  data-processing  applications 
of  automatic  character  reading  techniques,    (2)  discussed  certain  commonly  employed  methods  for 
character  recognition  in  simplified  or  general  terms,    (3)  chosen  selected  examples  to  illustrate 
important concepts and current progress, 3/ and (4) considered the broader field of automatic pattern
recognition  as  providing  research  results  clearly  indicative  of  potential  applicability  to  improvements 
in  practical  character  recognition  devices. 

2.     AREAS  OF  APPLICABILITY  OF  AUTOMATIC  READING  TECHNIQUES 

In the broad sense, the ultimate areas of applicability of automatic character recognition techniques
include:

(1)  All  documentation  activities  where  data  originally  recorded  in  handwritten,   typed, 
stencilled,    or  printed  form  must  be  copied,    re-recorded,   or  transcribed  to  the 
same  or  to  another  form  and  onto  the  same  or  onto  different  types  of  recording 
media; 

(2)  All  activities  where  handwritten,   typed,    stencilled,    or  printed  data  may  be  verified, 
counted,    sorted  or  selected  without  the  need  for  human  intervention  in  these  operations; 

(3)  All  activities  where  other  material  bearing  handwritten,    typed,    stencilled,    or  printed 
information  may  be  sorted,    counted,    selected,    or  routed  on  the  basis  of  the  data  so 
recorded,   again  without  human  intervention  in  the  data  identification  processes. 
1/ Compare, for example, the following: "NCR's reader has an instantaneous reading rate of
approximately 11,000 characters per second. In actual practice, rates are slower than this
because of the difficulty of moving paper documents at speeds permitting such fast reading."
(Ref. 346, p. 28.)

2/ Shepard, D. H., private communication, February 25, 1958. Compare also the following:
"Control of the source and quality of printing is ultra-important! Once this is achieved,
the equipment will take over and do an effective job." (Ref. 90, p. 39.)

3/ In the choice of illustrative examples, however, there has been some deliberate emphasis on
early references, precedents in the patent literature, and foreign developments, since these
are likely to be less well-known than other sources to readers who have a general familiarity
with progress in the field of automatic character recognition systems.
In this sense of character recognition, we are first concerned with the transcription of humanly-legible
alphabetic and numeric information to some other form, as in an aural character-code for the
blind, or as required for some communication-transmission system. Next, we note the applicability
of character reading devices as an alternative to the punching and subsequent interpretation of
punched-hole patterns on cards or tape for use in such data-processing operations as selecting, counting,
sorting, and deriving various statistical tabulations and analyses. With respect to the documentation
and  utilization  of  scientific  information,   there  are  many  possible  applications  in  the  preparation  of 
bibliographies,    listings  of  references  and  acquisitions,    catalog  cards  and  index  entries;  in  copying  of 
abstracts;  in  re-editions  of  handbooks,    catalogs  and  manuals  with  minor  revisions  of  text,   and  in 
automatic  proofreading,    especially  where  the  latter  task  is  tedious,   painstaking,   and  requires  careful 
concentration  because  of  the  absence  of  meaningful  context. 

In the field of mechanized information selection and retrieval systems, potential needs for automatic
character reading range from the creation of machine-usable files of documents, messages, and
records of subject content analyses, through the use of machines as aids to the analysis (e.g., by
various 'automatic extraction' techniques 1/ in the derivation of indexing and selection entries), to the
automatic dissemination of selected abstracts or texts to the users via communication channels. The
applicability of automatic character recognition techniques in mechanized selection and retrieval
systems is, in turn, complemented by the applicability of information selection and retrieval techniques
in the general field of pattern recognition, including the special field of character recognition. Thus,
as Minot has remarked, 2/ "Recognition beyond mere reception of a stimulus may be considered a
matter of information storage and then retrieval in accordance with stimulus cues."

This  complementarity  is  less  surprising  if  we  consider  that  increased  use  of  machine  processes 
for  information  selection  and  retrieval  will  be  concerned  with  basic  problems  in  the  recognition  of 
patterns--patterns in information, the data itself, and its surrogates, including linguistic and graphic
representations.     That  is,   improved  solutions  to  selection  and  retrieval  problems  may  well  depend  on 
answers  to  such  questions  as  how  a  machine  can  be  designed  or  programmed  to  recognize  a  structure 
embedded  within  another  structure.     Other  questions  relate  to  machine  processing  potentialities  for 
recognizing synonymity--of word with word, of drawing with word, of equation or formula with text
or  with  drawing,   and  of  described  function  with  given  structure,   as  in  the  search  to  correlate 
particular configurations in chemical structures with therapeutic effects. In terms of practical applicability
in the immediate future, however, the areas of application in information selection and
retrieval  operations  will  be  primarily  those  in  which  information  is  to  be  transcribed  for  machine 
processing  or  for  transmission  via  communication  links,    where  manual  transcription  is  too  slow,    too 
costly,   or  unable  to  cope  with  the  volume. 

Early patents pertinent to the art of automatic character recognition include devices for transcribing
copy for use in communication systems and for statistical counting or tabulation. A patent
for a statistical machine, issued to Goldberg in 1928, 3/ provides for such a statistical device.
Handel similarly disclosed a statistical counting invention employing symbol and character
recognition in a patent issued in 1933. 4/ It is interesting to note that while the Handel device must
discriminate between symbols in order to tabulate items in each symbol category, the use of this
device as a reader is nowhere claimed in the patent. Tauschek, however, in his application for
patent filed in 1929, 5/ specifically claimed to effect "the reading of such characters or indications
and thus replace reading by a person." Moreover, he foresaw the possible use of reading machines
for such purposes as "controlling any desired machine, however, it will be found very useful in
connection with office machines, ... testing and counting security bank notes and so forth and for
controlling automations and the like." Two patents issued in 1931, 6/ to Parker and Weaver,
1/ See Luhn, H. P. "Auto-encoding of documents for information retrieval systems," Ref. 281;
Luhn, H. P. "The automatic creation of literature abstracts (auto-abstracts)," Ref. 282;
Swanson, D. R. "Searching natural language text by computer," Ref. 467; and System Development
Corporation. "Research Directorate quarterly report," Ref. 469.

2/ Minot, O. N. "Automatic devices for recognition of visible two-dimensional patterns; a survey
of the field," Ref. 310.

3/ Goldberg, E. "Statistical machine," U.S. Patent 1,838,389; Ref. 169.

4/ Handel, P. W. "Statistical machine," U.S. Patent 1,915,993; Ref. 191.

5/ Tauschek, G. "Reading machine," U.S. Patent 2,026,329; Ref. 477.

6/ Parker, R. D. "Telegraph reading machine," U.S. Patent 1,815,986; Ref. 352.
Weaver, A. "Telegraph reading machine," U.S. Patent 1,815,996; Ref. 527.


respectively,   provided  somewhat  similar  recognition  devices  whereby  text  could  be  automatically 
transcribed  for  telegraphic  transmission. 

As  we  have  previously  noted,    one  of  the  first  of  the  potential  applications  to  be  actively  explored 
was  the  proposed  development  of  mechanical  devices  for  transcribing  information  from  the  printed 
page  to  signals  for  the  use  of  the  blind,    thus  eliminating  the  need  for  human  transcription  from 
printing  to  some  other  recording  medium  such  as  Braille.     Such  devices  would  also  minimize  the  need 
for  multiple  copies  of  material  in  Braille  form  by  providing  multiple  transcription  machines  to  be 
used by the blind reader when and where required. Reading aids for the blind which are of the
transducer (direct translation) type, involving no identification or recognition properly speaking, include
the Naumburg Visagraph as well as the original Optophone, 1/ the RCA A-2 or 'flying pencil'
reader, 2/ and several Battelle devices. 3/

On  the  other  hand,   the  recognition-type  devices  have  usually  been  found  to  be  less  susceptible  to 
practical  and  economic  implementation  as  aids  to  the  blind.    Thus   Zworykin,    Flory,   and  Pike, 
reviewing  the  state  of  the  art  in  1948,   observed  that: 

"In  the  case  of  the  letter-recognition  system,    little  is  known  of  any  early  work  although 
many  proposals  have  been  made.     Quite  generally  they  have  suggested  an  optical  matching 
system  which  would  look  at  the  letters  to  be  identified  and  run  through  a  set  of  letters,   for 
instance  on  a  drum,   until  the  optical  match  of  the  unknown  letter  was  found  at  which  time  a 
signal would be given to activate the indicator corresponding to the letter." 4/

In  these  relatively  early  years  it  could  be  concluded  that  the  recognition-type  system  proposals  would 
result  in  devices  that  would  be  both  costly  and  cumbersome. 

A decade later, at the Fifth Technical Session on Reading Machines for the Blind, held in 1958,
it was still necessary to emphasize that the more complex recognition-type machines would be
expensive, large in size, and therefore best used "in a library or other central facility." 5/
Intermediate machines, giving a reasonable correlation between a character configuration on the typed
or printed page and a distinctive sound, where the printed pattern is analyzed in terms of zones which
distinguish letters with ascenders or descenders from those without, have, of course, been under
development for some years. 6/ It is still true, however, even with the more recent progress in
automatic reading techniques, that character recognition systems do not yet meet the requirements of
either low cost or portability necessary for the blind. In fact, the commercially available character
reader, as of May 1961, may have a capital investment cost of $120,000 to $200,000 or more, 7/
may have a power consumption rate of 6 kilowatts, 8/ and may displace approximately the same cubic
space as a small digital computer.

The first published report of the RCA devices for the blind appeared in 1946, 9/ the same year in
which the computer ENIAC first went into productive operation. The age of Tauschek's "automations"
1/ Fournier d'Albe, E. E. Refs. 144, 145, 146. See also Refs. 61, 149, 199, 243, 386, 507, 508, 509.

2/ Donahue, W. "Research with A-2 Reader. Final report." Ref. 102. See also Refs. 370, 371, 372.

3/ Refs. 1, 2, 3, 4, 149. Various intermediary or partial-recognition devices are mentioned in
Refs. 149 and 243, and those described therein or in the various summary reports of the
Technical Sessions on Reading Aids for the Blind (Refs. 507, 508, 509) include devices of Argyle,
Blum, Brown, Schutkowski, and Thomas, among others.

4/ See Ref. 549, p. 484.

5/ See "Summary, Fifth Technical Session," Ref. 509, p. 1. Compare also the following: "It seems
doubtful that the recognition machine can be made simply and cheaply enough to be a personal
reading machine; rather it seems likely to find its place in libraries and institutions." (Ref. 199,
p. 24.)

6/ See, for example, Mauch, H. A. "The development of a reading machine for the blind," Ref. 300;
also Refs. 149, 301, 509.

7/ See, for example, Refs. 224 and 330, for cited prices of $133,800 and $190,000 respectively.

8/ For example, the Solartron ERA for Boots Pure Drug Company. See Ref. 117, p. 970.

9/ Refs. 370, 548.
was at hand. This initial use of a high-speed electronic digital computer led rapidly to the development
and utilization of a number of large-scale digital data processing machines for many
applications, with almost revolutionary impact on traditional data processing operations. Today,
there are literally thousands of computers large and small in use in many different parts of the world.
Computers are widely used for scientific and engineering computations; for statistical analyses and
tabulations; for accounting, record-keeping and paperwork processing in business, industry, and
government; as tools for research in universities and elsewhere. More recently, the use of
computers  for  mechanized  literature  search  and  information  retrieval  and  for  mechanized  translation 
has  received  increasing  attention. 

The  emergence  of  automatic  data  processing  techniques  has  emphasized  the  need  for  automatic 
reading  devices  and  has  generated  new  areas  of  applicability  for  such  machines.     The  high  speed  of 
the  automatic  data  processing  system  calls  for  rapid  input  of  information  in  order  to  keep  the  central 
processor  busy.     Automatic  data  processing  equipment  can  produce  enormous  outputs  at  high  speed, 
such  as  tables  of  numerical  values,   which  may  require  extensive  checking  or  proofreading. 
Computers and related processing equipment demand machine-usable files and machine-usable output,
including re-usable output, such as "turn-around" documents. 1/ In addition, it has been suggested
that a future use of automatic reading techniques will be for mechanized input to the computer of
handwritten programs. 2/

Even  more  significant  than  the  input  demands  of  automatic  data  processing  equipment,   however, 
is  the  fact  that  the  effective  utilization  of  the  new  equipment  demands  the  re-engineering  of  the  entire 
information  cycle  from  data  origination  to  data  usage  in  order  to  provide  an  integrated  processing 
system.     We  can  safely  predict  that  this  will  be  as  true  for  the  mechanized  library  of  the  future  as  it 
is  today  for  many  large-scale  business  operations.     The  whole  concept  of  integrated  data  processing 
stresses  the  importance  of  recording  information  at  its  point  of  origin  (or  point  of  first  recording)  in 
a  mechanically  re-usable  form.     This  is  necessary  in  order  to  speed  subsequent  processing  and  to 
avoid the errors that are inevitable at each stage of transcription or re-recording. Investigations of
complete machine processing of hand-written data sheets, which may be the original form of
recording of experimental data later to be incorporated in the technical literature, are also in
process. 3/

One  of  the  results  of  the  new  emphasis  on  the  speed  and  the  accuracy  with  which  input  data  can  be 
fed  into  automatic  data  processors  has  thus  been  to  provide  for  mechanization  of  information  at  the 
point  of  origin.     A  common  method  is  the  use  of  by-product  data  generation  or  of  "dual  language", 
e.g.  ,   the  generation  of  a  punched  paper  tape  with  the  necessary  data  about  a  transaction  produced 
automatically  in  the  proper  code  by  the  device  that  records  the  transaction.     The  information  is  thus 
recorded  simultaneously  in  two  forms,    one  of  which  is  immediately  legible  to  human  eyes  and  the 
other  of  which  is  directly  "legible"  to  processing  equipment. 

In many cases, however, the number and geographical dispersion of the points from which
necessary information originates preclude the use of such relatively expensive devices as
Flexowriters, or accounting machines with punched paper output, in favor of the ordinary typewriter.
At present, such typed information must usually be converted to machine-usable form (i.e., as holes
in punched paper tape or punched cards, or as magnetized spots on magnetic tape) by a separate
keyboard operation. Considerable advantage may be gained by the development of devices which
translate  or  convert  information  that  is  recorded  in  a  typed  or  printed  form  to  some  other  form. 
Preferably,   this  other  form  will  provide  direct  input  of  the  code  patterns  resulting  from  character 
identification  to  an  automatic  data  processing  system. 

Cash  register  tally  roll  accounting,   especially  in  retail  chain  store  operations  or  in  large 
department  stores,    is  a  further  example  of  situations  where  records  from  many  individual  locations 
are  subsequently  to  be  processed  in  a  centralized  operation.     This  is  an  area  of  applicability  that  is 
being  actively  investigated  both  in  the  United  States  and  abroad.     Figure  2,    for  example,    shows  a 
portion  of  a  tally  roll  record  which  was  read  by  a  Solartron  prototype  reader  in  demonstrations  at 
the  British  Computer  Exposition  held  at  Olympia  in  December   1958. 

Another  important  area,    considering  data  processing  operations  generally,    is  that  of  applications 
where  the  turn-around  document  is  an  integral  part  of  the  information  flow.     That  is,   as  we  have 


1/ This term refers to documents that are prepared, sent to customers or other users, and then
returned for further processing and automatic re-entry to, say, an accounting system or a
record file. It also includes source documents that are imprinted at remote locations. (See
Keller, A. E. Ref. 249, p. 23.)

2/ Young, D. A. "Automatic character recognition," Ref. 539, p. 4.

3/ See the report of character recognition research being conducted by Hansche and Steck at
Sandia Corporation in Ref. 506, p. 101.

Figure 2. Sample of Tally Roll Record Read by Machine
noted, the turn-around or re-entry document is one which, having been processed in whole or in part
by  machine,   is  distributed  to  customers  or  users  and  is  later  returned  for  further  processing. 
Examples  include  utility  bills,    subscription  records,   traveller's  checks,   and  premium  payment 
notices.     In  a  variety  of  potential  applications  in  the  U.S.   Government,    which  have  been  investigated 
by  a  task  group  of  the  X3-1     Subcommittee  of  the  American  Standards  Association,   a  large  part  of  the 
workload  involves  forms  and  documents  to  be  filled  in  by  multiple  sources,    including  the  general 
public. 

The turn-around document area thus specifically raises the problem not only of multiple sources
of data origination, but also of multiple methods of data inscription. These methods commonly
involve combinations of pre-printed, addressograph-plate inscribed, typewritten, business machine
prepared, and, to a lesser extent, handwritten data entries. For example, of 46 different Government
applications studied by the X3-1 Task Group, 83% of the carrier items considered had combined
addressograph, business machine listing, and typed information. These specific potential
applications in Government amount, in the aggregate, to over 2,000,000 separate forms which must
be processed daily. 1/

In  addition  to  voluminous  input  of  source  data,    such  as  current  transactions,   many  large-scale 
data  processing  operations  also  require  the  input  of  previous  records,    prior  summaries,   and  detailed 
data  maintained  in  reference  files.     Where  such  files  have  previously  been  maintained  manually,   an 
enormous  data  conversion  problem  is  often  involved  in  key  punching  the  data  so  that  it  can  be 
transcribed  to  storage  media  acceptable  to  the  machine,    such  as  magnetic  tape.     Where  these  files 
are  collections  of  the  literature,   or  even  of  library  catalog  cards,   the  problem  is  even  more  severe. 

The  use  of  computers  for  mechanized  translation  also  emphasizes  this  need  for  automatic 
character  reading.     For  example,   Shiner  predicts  in  part  that: 

"The  ideal  system  for  performing  automatic  language  translation  could  have  a  printed 
character recognizer which would automatically scan the Russian text, recognize the printed
characters ... and transform into a suitable code ... at speeds capable of keeping pace ... 10,000
characters per second." 2/

Russian  scientists  are  also  aware  of  the  need  for  automatic  character  reading  as  input  to  mechanized 
translation  programs.     Work  in  character  recognition  has  been  reported  at  the  Institute  for  Precise 
Mechanics and Computing Techniques, the Steklov Institute, the Kiev Computing Center, and
elsewhere in the Soviet Union. 3/

Potential applications of automatic data processing systems for mechanized translation may in
fact depend on the development of automatic reading techniques for the re-recording of dictionaries
and for the input of documents to be translated, if the conversion to automatic processing is to be
practical at all. For example, Wall 4/ points out that the development of an automatic reader to
transcribe from the printed text of the source material to the machine language used is crucial if
the process itself is to be economical. 5/ This is because human transcribers who do not know the


1/ Data from an informal memorandum by J. H. Cummins, April 4, 1961, to the Task Force
Chairman, Sub-Committee X3-1, A.S.A.

2/ Shiner, G. "The USAF automatic language translator, Mark I," Ref. 424, p. 301. Compare also
Reitwiesner and Weik, "Survey of the field of mechanical translation ...," Ref. 380, as follows:
"Problem areas requiring attention include ... character sensing of printed pages of source
language text at the rate of the order of thousands of characters per second ...," p. 40.

3/ See Ware, W. H., et al. "Soviet computer technology, 1959," Ref. 524. See also Refs. 21,
156, 380.

4/ Wall, R. E. "Some of the engineering aspects of the machine translation of language," Ref. 523.

5/ Pahl, in a later study of needs for character recognition for the mechanized translation program,
is even more specific: "At present, the transformation from alphabetic to binary code is done
by a machine operator who reads the text letter-by-letter and depresses corresponding keys on
the keyboard or a card or a tape punching device. The process is relatively slow and thus is
responsible for roughly 60% of the per word cost of machine translation." Ref. 350, p. 4-5.
source language will copy slowly and will make many errors. Again, the volume is enormous. In a
large-scale mechanized translation project being conducted by Georgetown University for the United
States Government, over 500,000 words of text in the field of economics and a similar 500,000 words
from the literature of organic chemistry have already been transcribed by manual operations. It is
estimated that an additional 3,000,000 words of text will have to be keypunched during the first year
of production-type operations of this MT system. 1/

The  need  for  automatic  character  recognition  devices  to  provide  the  basic  input  to  files  of 
information  about  items  to  be  selected  and  retrieved  in  mechanized  document  retrieval  systems  has 
also  been  widely  recognized.     In  special  situations  such  as  that  of  the  Patent  Office  collections,    the 
volume  of  information  to  be  recorded,    transcribed,    and  stored  for  subsequent  search  may  be  more 
than  the  full  text  of  the  patents  themselves.     Koller,    Marden,   and  Pfeffer  point  to  this  need  for 
reducing the file-input-processing bottleneck, as follows:

"One  of  the  most  voluminous  and  demanding  tasks  in  setting  up  a  large  searching  system 
will be the preparation of the search file (the library file). An enormous expenditure of manpower
will be required to analyze and encode the documents and to transcribe the codes onto a
permanent  storage  medium.     Since  the  effectiveness  of  the  system  will  depend  largely  upon  the 
accuracy  and  completeness  with  which  the  file  is  prepared,    the  importance  of  this  task  cannot 
be overemphasized. This job will ultimately have to be performed by machines if our long-range
objectives are to be achieved." 2/

In  the  preparation  of  scientific  information  for  publication,    we  also  find  needs,   not  only  for 
recognition  of  typed  characters,   but  also  for  the  development  of  techniques  that  will  enable  the 
recognition  of  handprinted  or  handwritten  characters  and  symbols.     At  present,   the  repetitious  typing, 
copying,    re-typing  and  transcription  of  a  series  of  drafts  and  revisions  contribute  significantly  to  the 
costs  of  physical  preparation  of  papers  for  publication.     In  this  connection,    we  note  that  Russian 
scientists  are  reported  to  be  considering  the  use  of  character  recognition  devices  for  automatic 
typesetting. 3/

In  other  potential  applications,    incoming  textual  material  may  be  scanned  for  the  occurrence  of 
certain  key  words  or  phrases  which  are  then  used  for  mechanized  indexing  or  for  preliminary  routing 
or  classification.     This  latter  example  is  analogous  to  preliminary  sorting  of  incoming  mail,   where  a 
preliminary  subject-matter  or  field-of-interest  segregation  is  made  by  mailroom  personnel.     In  some 
operations,   machine  detection  of  certain  marks  or  symbols,    other  than  the  text,    can  be  used  in 
selective  retranscription  or  re-use  of  data,    such  as  the  reading  and  retranscription  of  those  portions 
of  teletype  messages  that  have  been  bracketed  by  a  human  editor. 
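
The key-word scanning described above can be sketched in a few lines. This is a minimal illustration only, assuming the reader has already delivered the text as a character string; the routing table, the destination names, and the function `route_text` are hypothetical and do not come from any system described in this report.

```python
import re

# Hypothetical routing table: key phrase -> destination (illustrative only).
ROUTING_TABLE = {
    "organic chemistry": "chemistry desk",
    "patent": "patent search group",
    "translation": "linguistics group",
}


def route_text(text, table=ROUTING_TABLE, default="general"):
    """Return the sorted destinations whose key phrase occurs in the text."""
    # Collapse line breaks and runs of spaces so phrases split across lines still match.
    normalized = re.sub(r"\s+", " ", text.lower())
    hits = {dest for phrase, dest in table.items() if phrase in normalized}
    # Material with no recognized key phrase falls through to a default bin.
    return sorted(hits) if hits else [default]
```

A message may be routed to more than one destination, which corresponds to preliminary classification rather than a final sorting decision.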

Similarly, recognition of special symbols may be used to adjust the reading process itself: for
example, the detection of underlining may cause the machine to skip the symbols that are
underlined. 4/ Conversely, the reader system may be designed so that it reads only underlined
material, for example, the specific words in a patent text that an analyst has marked for
conversion to appropriate code symbols for subsequent mechanized search.
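
The two underline-controlled modes can be illustrated as a simple filter on the recognizer's output. The sketch below assumes a hypothetical recognizer that reports each character together with a flag saying whether an underline was detected beneath it; the function name and the mode names are inventions for the example.

```python
def filter_recognized(symbols, mode="skip_underlined"):
    """Select characters according to the underline flag.

    symbols: iterable of (character, is_underlined) pairs, as a hypothetical
    recognizer might report them.
    mode: "skip_underlined" omits underlined characters from the output;
          "only_underlined" transmits nothing but the underlined characters.
    """
    if mode not in ("skip_underlined", "only_underlined"):
        raise ValueError("unknown mode: %r" % (mode,))
    keep_underlined = mode == "only_underlined"
    return "".join(ch for ch, underlined in symbols if underlined == keep_underlined)
```

The same recognition pass serves both uses; only the rule applied to the underline flag changes.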


1/ As reported by P. P. Toma, "Brief description of the experiments conducted in machine
translation at Georgetown University, Washington, D.C., with special view on lexicography and
lexical information theory." Distributed at Kolloquium über Maschinelle Methoden der
Literarischen Analyse und der Lexikographie in Tübingen, 24-26 November 1960. [Symposium
on Machine Methods of Literary Analysis and Lexicography.]

2/ Koller, H., E. Marden, and H. Pfeffer. "The HAYSTAQ system: past, present, and future,"
Ref. 260, p. 334.

3/ Reitwiesner, G. W. and M. H. Weik. "Survey of the field of mechanical translation of languages,"
Ref. 380, p. 33.

4/ "It will be noted that a certain amount of automatic editing may be expected of a machine
according to my invention. For example, the machine may command that any character which
is underlined be omitted in transmission to the output mechanism. This is accomplished by
simply assigning a shape and memory unit to recognition of an underline. When this shape is
recognized, the output pulse is blocked by a relay or tube. Also, the insertion of special
symbols may be used to tell the machine to perform such functions as to go up a line to read
a word inserted in the space between double-spaced lines of text, to go immediately to the next
line of text, or to stop the machine." --Shepard, D. H., Ref. 415.
An important area of application of the read-and-sort possibilities is concerned with the
recognition of addresses on mail, so that the envelopes may be sorted to appropriate
destination-routing bins. The Canadian, U.S., Dutch, and German Post Office Departments, among
others, have all been active in the support of such developments. Continuing support is
exemplified by the following announcement:

"An  all-electronic  alphanumeric  recognition  device  for  identifying  typed  or  printed  envelope 
addresses is being developed by Philco Corporation's Research Division...for the U.S. Post
Office Department. The machine will be integrated with letter-sorting machines developed by
the Post Office Department to further the automation of mail-handling in post offices.
Electronic-scanning techniques will be used to locate and sort as many as 50 different addresses
at the rate of 1,000 alphanumeric characters per second." 1/

Additional areas of applicability of automatic character recognition techniques include the
checking, verification, and proof-reading of machine output, such as tables of numerical values,
ballistic tables, code-book data, printed indexes, and bibliographical lists. The use of high-speed
on-line or off-line printing of the results produced by computer generation and processing of such
data emphasizes the need for high-speed verification processes to keep pace with output.

As  a  further  example  of  potential  applicability,   we  note  that  transliteration  (whether  from  the 
geometric  configuration  of  an  alphabetic  character  to  the  punched  hole  pattern  representing  that 
character  or  from,    say,   a  Cyrillic  script  character  to  its  English  language  equivalent)  can  frequently 
be  achieved  as  a  by-product  of  the  recognition  process.     In  effect,   the  symbol  that  is  read  is  decoded. 
The  identification  of  the  characteristics  of  the  input  symbol  as  a  specific  one  of  a  number  of  reference 
symbols  in  the  vocabulary  is  used  for  re-encoding.     Multiple  transliterations  can  be  made  as  desired 
from  the  single  identification  decision.     Thus,   a  reading  device  with  added  processing  logic  should 
enable  elaborate  decoding  and  unscrambling  of  security-coded  material.     In  this  connection,    as  well 
as  in  that  of  mechanized  translation,    it  has  been  noted  that:     "One  of  the  major  bottlenecks  in  the 
field  of  Intelligence  Data  Processing  is  the  translation  of  printed  matter  to  machine  language.  "  2/ 
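
The point that several transliterations follow from one identification decision can be shown with a table-lookup sketch. It assumes that the recognizer has already identified each input symbol as one of the reference symbols in its vocabulary. Both tables below are illustrative, and the numeric codes in particular are invented for the example and do not correspond to any actual punched-card or tape code.

```python
# Each table maps an identified reference symbol to one output encoding.
# The tables are hypothetical; the numeric "code" values are invented.
TABLES = {
    "latin": {"\u041c": "M", "\u0418": "I", "\u0420": "R"},  # Cyrillic M, I, R
    "code": {"\u041c": 13, "\u0418": 9, "\u0420": 17},
}


def reencode(identified, tables=TABLES):
    """Produce every requested re-encoding from a single list of identifications."""
    # The recognition decision is made once; each table is a by-product lookup.
    return {name: [table[sym] for sym in identified] for name, table in tables.items()}
```

Adding a further output code means adding a table, not repeating the recognition step.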

In  general,    then,   there  are  many  situations  where  automatic  character  recognition  devices  are 
and  will  be  needed.     These  include  a  variety  of  business  data  processing  operations  in  which  punched 
cards are widely used today. For these operations, we can be, as Spiegelthal has said, "hopeful that
keypunch machine operators will sooner or later be superseded by character reading devices." 3/
Even  more  optimistically,   the  president  of  the  National  Data  Processing  Corporation,    which  is  active 
in  both  magnetic  ink  and  optical  reader  developments,   has  predicted: 

"Now  the  technological  barrier  has  been  broken.     Machines  are  in  being  that  can  'read' 
and  process  information  on  documents  of  original  entry  and  sort  these  documents  into  any 
desired arrangement. Investment is more than justified by the elimination of the costly
inefficiencies inherent in the use of punched cards. Availability of these machines will enable
the automation of business operations to move forward at an even more rapid pace than
heretofore and the obsolescence of the punched card is at hand." 4/

In many of the areas of potential applicability, however, greater flexibility is needed than is
provided by currently available readers, which have a limited vocabulary, often of specially stylized
design. In many of these areas the quality of the input cannot be controlled, and a wide variety
of sizes and styles of type will occur. In most of the proposed mechanized translation programs,
natural language text, as printed on the page, is the necessary input. Thus, as Pahl has emphasized:

"For  machine  translation  and  related  applications,   the  need  is  for  a  reader  with  maximum 
flexibility.     That  is,    the  reader  must  be  able  to  recognize  a  large  group  of  characters  and 
resolve a multitude of problems relating to variations in the printed material." 5/

A  printed  page  may  also  contain  subscripts  and  superscripts,    special  symbols,   and  graphic 
material  such  as  mathematical  equations,    chemical  structure  diagrams,    charts,    drawings,   and 
photographs.     In  such  cases,   a  temporary  solution  may  be  to  provide  certain  manual  pre-processing 
steps  such  as  the  sorting  of  material  by  font  or  the  masking  out  of  graphic  material  and  footnotes 
appearing  in  small  type. 


1/ "An all-electronic alphanumeric recognition device for identifying typed or printed
envelope addresses...." Ref. 5, p. 72.

2/ Stone, W. P. "Alphanumeric character reader," Ref. 464, p. 1.

3/ Spiegelthal, E. S. "Computing educated guesses," Ref. 438, p. 70.

4/ Philipson, H. L., Jr. "A prediction...", Ref. 356, p. 26.

5/ Pahl, P. M. "Automatic character recognition." Ref. 350, p. 53.


Techniques  being  developed  for  automatic  character  recognition,   however,    should  also  eventually 
be  applicable  to  the  recognition  of  simple  geometric  shapes  and  schematic  stylized  graphic  material 
such  as  is  found  in  line  drawings  and  electrical  circuit  or  chemical  structure  diagrams.     The 
potentialities  for  automatic  recognition  of  graphic  information,    specifically  including  the  problems  of 
machine  encoding  of  chemical  structures,   have  been  considered  both  in  the  HAYSTAQ  system  and  in 
other  projects  of  a  cooperative  program  between  the  National  Bureau  of  Standards  and  the  Patent 
Office. 1/ The Perkin-Elmer Corporation, which has been active in pattern recognition developments
for  blood-cell  identification,   has  also  explored  the  problems  of  machine  recognition  of  chemical 
structure  diagrams.    2/ 

The first approaches to machine recognition of simple shapes in line drawings have already been
demonstrated, for example, by Shepard, Harmon, Hodes, and Singer, 3/ among others. At the
Western  Joint  Computer  Conference  held  in  Los  Angeles  in  May  1961,    Uhr  reported  additional  results 
from  a  pattern  recognition  program,    4/  including  recognition  of  outline  drawings  of  shoes,    chairs, 
and  comic  strip  cartoon  faces.     Fain,   a  Russian  scientist  working  in  the  field,   has  been  reported  as 
investigating  possibilities  for  recognition  of  three-dimensional  objects  by  a  technique  involving 
possible  projections  in  terms  of  a  grid  mask.    5/ 

Research  in  pattern  recognition,   while  not  typically  oriented  to  the  development  of  operational 
devices,   even  more  noticeably  points  to  additional  possibilities.     Areas  of  future  applicability  which 
are  being  explored  in  pattern  recognition  research  include: 

(1) Identification of cursive handwriting; 6/

(2) Speech recognition; 7/

(3) Recognition of Morse Code messages; 8/

1/ See, for example, Kirsch, et al., writing in 1956, as follows: "Another intriguing problem is to
find a way that a machine could 'look at' a diagram, such as a chemical structure diagram, and
characterize it uniquely. The work to date has not been concerned with the more symbolic
information that appears on structure diagrams, such as element symbols, double bonds, etc.
We have attempted to treat only simple nets composed of vertices and bonds drawn between them.
The connection pattern has been treated as a topological net and we are not concerned with such
things as size of angles, length of lines, widths of lines, and line breaks. The program we have
been working on will first locate most of the vertices by counting the number and extent of clumps
of 'black' spots in each line of a picture in both vertical and horizontal tracings. Where these
numbers change between successive lines a vertex is indicated. Then, starting at a vertex the
bonding pattern could be traced from vertex to vertex." (Kirsch, R. A., L. Cahn, L. C. Ray and
G. H. Urban, "Experiments in processing pictorial information with a digital computer," Ref. 257,
p. 226.)

2/ Private communication, E. W. Schlieben, April 19, 1961.

3/ Harmon, L. D. "A line-drawing pattern recognizer," Ref. 194. Shepard, R. N. "Application of
IPL-V to the simulation of perceptual learning," Ref. 420. Hodes, L. "Machine processing of
line drawings," Ref. 216. Singer, J. R. "Electro-mechanical model of the human visual system,"
Ref. 426. Also, "Model for a size invariant pattern recognition system," Ref. 429, and "A
self-organizing recognition system," Ref. 430.

4/ Uhr, L. and C. Vossler. "A pattern recognition program that generates, evaluates, and adjusts
its own operators," Ref. 497.

5/ Garmash, V. A. "Seminar on reading machines," Ref. 156, p. 12.

6/ Frishkopf, L. S. and L. D. Harmon. "Machine reading of handwriting," Ref. 151. See also
Refs. 98, 99, 192, 277.

7/ See, for example, Fry, D. B. and P. Denes. "Experiments in mechanical speech recognition,"
Ref. 153, also Ref. 152.

8/ Especially the Maude program at MIT. See Gold, B. "Machine recognition of hand-sent
Morse Code," Ref. 168, also Refs. 107, 295.
(4) Target detection in aerial photographs; 1/

(5) Detection of patterns in electro-cardiograph or electro-encephalograph recordings; 2/

(6) Automatic correction of misspelled words and other misprints; 3/

(7) Fingerprint identification; 4/

(8) Recognition of specific features in star plates or bubble chamber data; 5/

(9) Recognition of features in microphotographs of sections of tissue, metal structure, and the
like. 6/

It has been suggested that the development of machines capable of translating back and forth
automatically between printed and spoken language would simplify the telephone system, replace
stenographers, and effect great economies in communication channel capacity, as well as giving the
blind a way to read and the deaf a means to hear. 7/ It has also been suggested that mechanized
pattern recognition will be important in space exploration so that a probe can alter its trajectory 8/
on recognizing where it is.

Ultimately, then, we should look to the development not only of versatile readers capable of
handling multiple fonts, special symbols, and other schematic information such as that contained in
structural or electrical circuit diagrams, but also of pattern recognition techniques capable of
processing pictorial as well as textual material.

3.  CONTROLLED  SOLUTIONS  TO  CHARACTER  RECOGNITION  PROBLEMS 

In  view  of  the  very  broad  extent  of  the  areas  of  potential  applicability,   it  is  not  surprising  that 
many  different  solutions  have  been  suggested  for  problems  involving  the  possibility  of  automatic 
character  recognition.     When  specific  areas  of  applicability  of  automatic  reading  techniques  have  been 
identified  and  the  objectives  to  be  served  have  been  defined,   it  is  sometimes  quite  feasible  to  limit  the 
problem  of  developing  and  using  automatic  character  reader  systems  by  adopting  controlled  solutions. 
Such  controlled  solutions  include  specifically:     (1)    the  preprinting  of  the  input  material  that  is  to  be 
read,    (2)  the  establishment  and  maintenance  of  quality  controls  on  the  input  material,    (3)  the  adoption 
of  special  stylizations,    such  as  use  of  a  standardized  font,   and  (4)  the  limitation  of  the  nature  and  size 
of  the  vocabulary  to  be  recognized.     It  is  actually  for  situations  where  such  special  conditions  have 
been  imposed  that  most  of  the  progress  made  to  date  in  the  productive  use  of  reader  devices  has 
occurred. 9/ Controlled solutions of these four types are discussed below.


1/ See Uhr, L. "Intelligence in computers; the psychology of perception in people and machines,"
Ref. 494, p. 179; see also, for example, Murray, A. E. "Perceptron applicability to
photo-interpretation," Ref. 320.

2/ See Bauer, W. F., D. L. Gerlough, and J. S. Granholm. "Advanced computer applications," Ref.
41; Farley, B. G., L. S. Frishkopf, W. A. Clark and J. T. Gilmore. "Computer techniques for the
study of patterns in the electroencephalogram," Ref. 130; Taback, L., E. Marden, H. L. Mason
and H. V. Pipberger. "Digital recording of electrocardiographic data for analysis by a digital
computer," Ref. 471.

3/ See Blair, C. R. "A program for correcting spelling errors," Ref. 48. Reitwiesner and Weik
report that D. G. Ellison of Indiana University has worked on a computer to read printed characters
and to correct misprints. (Ref. 380, p. 20.)

4/ "Scanner spots fingerprints," Ref. 401; see also Sherman, H., "A quasi-topological method for
machine recognition of line patterns," Ref. 421.

5/ See Innes, D. J. "FILTER -- a topological pattern-separation computer program," Ref. 227;
Metzelaar, P., "Mechanical realization of pattern recognition," Ref. 305; and Uhr, L., Ref. 494.

6/ Tolles, W. E. and R. C. Bostrom. "Automatic screening of cytological smears for cancer; the
instrumentation," Ref. 488.

7/ Miller, G. A. "Speech and communication," Ref. 308, p. 397; see also Young, D. A. "Automatic
character recognition," Ref. 539, p. 4.

8/ Metzelaar, P. "Mechanical realization of pattern recognition," Ref. 305, p. 2 (preprint).

9/ Compare, for example, De Paris, J. R. "Optical character recognition equipment," as follows:
"The secret of optical scanning lies in control of printing. Virtually any situation in which you
control the printing of documents at the source is susceptible to optical scanning. If you can
control the physical dimensions of the document, the positioning of the data on the face of the
document, and the actual printing of data - be it by typewriter, plate imprinter, line printer, or
whatever - you have satisfied the primary requirement for effective use of optical scanning."
(Ref. 90, p. 38.)
3.1  Preprinting  of  Input  Material

Preprinting of the input material that is to be read covers two general cases. First is the case of
simultaneous recording of data in machine-usable form as well as in typed or printed form. Second is
the case of controlled preprinting using special inks, special fonts, and characters stylized in such a
way as to provide maximum legibility for the machine in distinguishing one character from another.

Included  in  the  first  case  are  various  systems  capable  of  by-product  data  generation.     Sometimes 
this  solution  involves  a  second,    simultaneously  produced  record  on  its  own  carrier  medium,    such  as 
the  production  of  a  punched  paper  tape  concurrently  with  the  production  of  typed,   hard  copy  originals. 
Sometimes  it  involves  the  imprinting,   on  one  and  the  same  document  or  carrier,   of  both  humanly 
legible  characters  and  of  the  same  information  in  encoded  form,    such  as  the  use  of  a  bar  or  dot  code. 
Special  inks,    ribbon  or  carbon  paper  may  be  used  to  provide  the  encoded,    'dual  language'  version  of 
information  that  is  also  simultaneously  recorded  in  conventional  form. 

For potential applications in information selection and retrieval systems and in mechanized
translation, an interesting variant of this first case is the preparation of special punched paper tape
as an intermediate step in setting text for Monotype printing. Another example is the use of a
tape-controlled typewriter in the initial preparation of texts of abstracts, producing a punched paper
tape version as a direct by-product. The use of tape-typewriters has, for example, been proposed as a
method for cooperative sharing of abstract and catalog card information between libraries and
information centers. 1/

Monotype  or  Teletypesetter  punched  paper  tapes  are  or  could  be  made  available  for  certain 
printed  books,    certain  published  abstracts,   and  certain  periodicals.     Both  the  preprints  for,   and  the 
Proceedings  of  the  International  Conference  on  Scientific  Information,   held  in  Washington  in  November 
of  1958,   were  deliberately  processed  so  that  the  Monotype  tapes  would  later  be  available  to 
researchers  interested  in  automatic  indexing  experiments  or  in  machine  processing  of  natural 
language texts. 2/ Unfortunately, no commercially available equipment can at present convert
the 30-channel Monotype tape, with its special format and special symbols for controlling the
subsequent typesetting process, to either punched card code or conventional computer input
media. Prototype equipment for conversion either to punched cards 3/ or to punched paper tape 4/
has been developed, however.

An  interesting  proposal  has  recently  been  made  by  the  Air  Force  Office  of  Scientific  Research  to 
compile  abstracts  of  basic  research  supported  by  that  office  for  issuance  both  in  book  form  and  in 
the  form  of  magnetic  tape  derived  from  the  by-product  paper  tape  produced  on  initial  typing.     Loan 
copies  of  such  tapes  are  to  be  made  available  to  workers  who  wish  to  experiment  with  the  recorded 
material, on condition that they report back the results of their experiments. 5/

The second case of the typical preprinting solutions to automatic reading problems is that of
controlled preprinting using special inks, special fonts, and a limited vocabulary of characters
stylized so as to provide greater machine legibility. This second case includes a variety of operations
exemplified by check-handling operations in banks, where the serial number, account number, and
other information may be preprinted in special ink (magnetic, fluorescent, phosphorescent, etc.) and
in either a standard or a specially designed font. In 1956, the Bank Management Commission of the
American Bankers Association approved the adoption of magnetic ink character recognition as the
basis for a common machine language most suitable for check handling. This action was based upon
the report of the Technical Subcommittee on Mechanization of Check Handling, which had conducted a
survey of the then available character recognition techniques. 6/


1/ Mooers, C. N. "The 'Tape typewriter plan', a method for cooperation in documentation,"
Ref. 315.

2/ This deliberate decision included a specified requirement that periods ending sentences be
followed by a double space in order to distinguish these punctuation marks when used in this way
from the periods used to indicate decimal points or abbreviations. The requirement was met in
spite of the printer's objections on grounds of typographical aesthetics. (Refs. 237, 238.)

3/ A prototype equipment was used, in part, for an experiment at IBM on automatic indexing of a
small portion of the text of the Preprints, International Conference on Scientific Information,
Ref. 237.

4/ At the University of Cambridge. Private communications, R. Wisbey and E. Mutch, Feb. 1961.

5/ Wooster, H. "Possible availability of interdisciplinary abstracts on magnetic tape," Ref. 538.

6/ See Refs. 8, 9, 10, 11, 12, 13, 14.

The ABA recommendations for magnetic ink character recognition stressed the value of normally
appearing Arabic characters, the ability of large and small printing concerns to use the inscription
medium, the ability to imprint by a variety of printing means, including typewriters, and the reading
accuracy demonstrable with then existing reader devices. The choice of magnetic over optical
reading techniques, however, rested largely on the weight placed on the evaluation factor of
relative insensitivity to over-marking. Mutilations and obliterations occur in check handling both by
the public and by banking personnel. It was considered that the reading equipment must be capable
of recognizing the basic information even when obscured by overprinting (teller's stamp), pencil or
ink marking, adhesive tapes used to mend torn checks, and other "noise" superimposed by oil,
grease, and the like, in order that the rate of machine rejections be held to an acceptable minimum.

Both in America and abroad, the magnetic ink character recognition systems are gaining
acceptance in banking and accounting operations. At least five organizations or groups are actively
preparing machine installations for test by the Federal Reserve System. 1/ One of the Federal
Reserve Banks has determined that better than 12% of the banks which it serves are already using
MICR. 2/ Representative of European developments is the Compagnie des Machines BULL 3/
magnetic ink reader demonstrated for the Conference of European Bankers held in Paris in October
1959 and presently undergoing trial operation in a Credit Lyonnais installation. This machine reads
preprinted numeric information in a specially designed font where each figure consists of a series of
seven vertical lines spaced either 0.2 or 0.4 mm apart. Interpretation and recognition are achieved
on the basis of the combination of widths for the six spaces of each character: in effect, a
multiple-cut font. Advantages claimed for this technique include the normal appearance of this font
in terms of legibility to the human eye, and the tolerance in printing, since the amount of ink in
each line is not itself the basis for identification.
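
The spacing principle just described can be sketched as follows: measure the positions of the seven vertical lines, classify each of the six gaps as narrow (about 0.2 mm) or wide (about 0.4 mm), and look up the resulting pattern. The sketch assumes that line positions are already available as numbers in millimetres. The assignment of gap patterns to figures in `PATTERN_TO_FIGURE` is hypothetical, since the actual assignments of the BULL font are not given here.

```python
NARROW_MM, WIDE_MM = 0.2, 0.4
THRESHOLD_MM = (NARROW_MM + WIDE_MM) / 2  # a gap above this counts as wide

# Hypothetical assignment of gap patterns to figures (0 = narrow, 1 = wide).
PATTERN_TO_FIGURE = {
    (0, 0, 1, 1, 0, 0): "0",
    (1, 0, 0, 0, 1, 0): "1",
}


def gap_pattern(line_positions_mm):
    """Classify the six gaps between seven vertical lines as narrow or wide."""
    if len(line_positions_mm) != 7:
        raise ValueError("expected seven line positions")
    xs = sorted(line_positions_mm)
    return tuple(1 if (b - a) > THRESHOLD_MM else 0 for a, b in zip(xs, xs[1:]))


def read_figure(line_positions_mm, table=PATTERN_TO_FIGURE):
    """Return the figure for the measured pattern, or None to signal a reject."""
    return table.get(gap_pattern(line_positions_mm))
```

Because only the gaps are compared against a threshold, small printing variations in line position or in the amount of ink leave the decision unchanged, which is the tolerance claimed for the technique.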

The FRED (Figure Reading Electronic Device) System developed by Electrical and Musical
Industries (E.M.I.) in England is claimed to be capable of either magnetic ink or optical character
recognition. This system utilizes a specially designed typeface style. In the FRED System,
the shape of each character conforms to a fixed distribution of variable width black strokes with
respect to the central five or seven vertical columns or zones. 4/ A prototype reader for
the American E 13 B font is reported to be under development by Telefunken in Germany. 5/

In the United States, use of the MICR system in situations where preprinting occurs is proceeding
so rapidly that a number of test and evaluation centers are being established to control the
quality of the magnetic ink imprinting. 6/ An address on testing devices for MICR printing was
presented at the 1959 annual convention of the Lithographers and Printers National Association. 7/
This address reported on experiments conducted by the Office Equipment Manufacturers Institute
with the cooperation of a number of printers, demonstrating that for a given press and ink there is a
good correlation between the 'color' or blackness of the printed image and the required magnetic
level.

With  regard  to  possibilities  for  optical  rather  than  magnetic  preprinting,    similar  considerations 
of    possible  standardization  and  a  specially  designed  stylized  type-font  are  being  explored.     Since 


1/ National Cash Register, Burroughs, Ferranti-Packard and Pitney-Bowes, IBM, the National
Data Processing Corporation.

2/ Emma, Thomas. "Federal Reserve to test five systems," Ref. 122, p. 35.

3/ "Direct reading for data processing," Ref. 101.

4/ See Figure 5(a), p. 25.

5/ See Auerbach, I. "European electronic data processing...," Ref. 21. Auerbach also reports
work at Siemens. Note, however, that a distinction should be made between MICR reader
developments in Europe, and other European work in optical character recognition, discussed
elsewhere in this report.

6/ A trade publication report, "A magnetic ink character evaluation center," indicates that the
General Electric Center at Phoenix will in fact use a 'production character-reader sorter' for
evaluation of printing quality (Ref. 288).

7/ Miller, G. M. "Testing devices for magnetic ink imprinting," Ref. 309, pp. 4-5.

1959, there has been increasing interest in cooperative effort to study these problems and to provide
generally  acceptable  recommendations.     A  special  committee  on  "Optical  Scanning  Standards"  was 
formed  by  the  Retail  Research  Institute  of  the  National  Retail  Merchants  Association  in  August  1959. 

This  Optical  Scanning  Standards  Committee  has  recommended  optical  rather  than  magnetic 
character  recognition  systems  for  application  in  the  retail  merchandizing  industry  for  the  following 
reasons: 

1.  Field  imprinting  should  be  possible  with  less  expensive  and  more  reliable  imprinting  devices; 

2.  Re-entry  processing  should  be  possible  with  at  most  a  change  of  typeface  on  existing  printers; 

3.  Use  of  plastic  plates  (e.  g.  ,   for  customer  charge  recording)  could  continue; 

4. Existing equipment, such as adding machines, cash registers, and printers, could
continue to be utilized with relatively minor modifications. 1/

This  NRMA  Committee  has  maintained  liaison  with  other  industry  groups  studying  similar  problems, 
e.g.  ,   the  Air  Transport  Association  and  the  Life  Insurance  Automation  Committee. 

More recently, with the establishment of the X-3 Committee on Automatic Data Processing of the
American Standards Association in March 1960, an even broader interest in possibilities for
controlled solutions to automatic character recognition problems has developed. Subcommittee
X3-1 is specifically concerned with character recognition and will consider problems of fonts and
formats, printing requirements, and applications requirements both in industry and in government. 2/

It  is  to  be  noted  that  while  the  magnetic  ink  recognition  systems  are  largely  limited  to  operations 
where  preprinting  is  feasible,   the  optical  recognition  methods  are  applicable  not  only  to  preprinting 
solutions  but  also  to  situations  where  the  adoption  of  stylized  standard  fonts  provides  a  more  general 
solution.     Ultimately,   optical  recognition  techniques  will  be  applied  to  operations  involving  processing 
of  large  volumes  of  data  typed  or  printed  without  prior  restriction.     Thus  the  X3-1  Task  Groups 
exploring  applications  may  be  expected  to  investigate  not  only  business  data  processing  requirements 
but  also  those  requirements  which  involve  the  further  processing  of  enormous  quantities  of  previously 
printed  material,    such  as  is  found  in  the     ever-growing  body  of  scientific  and  technical  literature. 

We  note,   however,   that  preprinting  of  input  material  enables  extensive  control  of  document 
format,    carrier  characteristics,    characteristics  of  the  inscription,    placement  and  spacing  of 
information,   an  arbitrarily  limited  vocabulary,    choice  of  type  styles  including  those  of  special  design, 
and  other  critical  factors  affecting  production  and  performance  of  character  reading  systems  for 
business  and  industry. 

3.2    Quality Control of Input Material

Preprinting of input material as a controlled solution to problems of automatic reading and
re-transcription can be applied only in those limited situations where there is administrative control
over the data-originating processes as well as control over the information itself. Generally,
however, the data to be read is a direct product of transactions whose timing, occurrence, volume,
format, and specific information content cannot readily be predicted. In such situations, only a
limited amount of the data can be preprinted, at best. However, if the organization that is concerned
with subsequent reading and use of the data is also administratively responsible for the
data-originating operations, administrative requirements and specifications can be established which will
have  the  effect  of  controlling  to  some  extent  the  quality  of  material  that  will  become  input  to  an 
automatic  reading  process. 

1/ Sherwood, H. F. "An interim report on optical character recognition," Ref. 422.

2/ See Ref. 349.

Quality control requirements include, first, advance prescription of the information carrier
characteristics such as the size, shape, rag content, color, and other characteristics of the paper
to be used in printing or typing messages or data. The characteristics to be specified would of
course be selected so as to minimize paper handling problems and to maximize ink-to-paper contrast
and definition of character edges. Secondly, specifications as to the inscription procedures to be
used might include requirements for daily or hourly cleaning of typewriter keys, replacement of
typewriter ribbons after a specified number of uses, requirements to double- or triple-space all
material to provide substantial leading, and the like. Where preprinting of the information is not
feasible, it may nevertheless be practical to preprint guide lines, such as boxes, within which
information must be placed, even if it is to be inscribed by the general public, as in tax returns.
Requirements  for  ample  but  precise  margins  and  other  format  specifications  can  be  established. 
These  controls  would  minimize  problems  in  positioning  of  information  for  subsequent  scanning. 
Controls  on  the  size  of  the  total  symbol  vocabulary  to  be  permitted  in  the  system  can  extend  to  the 
prescribed  use  of  a  single  typestyle,   a  limited  number  of  characters,   or  to  the  requirement  of  the 
use  of  a  stylized  font  especially  designed  for  mechanized  reading. 

Quality control by means of constraints on placement and shape of input characters has also
been considered in the case of handwritten block capital letters prepared to fit (i.e., touch the
appropriate edges of) a prescribed guide-box frame, as in the case of handwritten numerals inscribed
by telephone operators. If the operator writes down the digits with care to center them about two
preprinted dots, a surprising variability in the actual shape of handwritten numerics can be
accommodated in a character recognition system. 1/ In a proposed variation of the FOSDIC System,
in which mark-sense data pattern recognition is used for direct input to processing equipment, 2/
handwritten numerals so drawn that distinctive "tails" project from the top, bottom, or sides of a
preprinted box would provide a basis for subsequent identification. 3/

Where  controls  over  the  data  originating  operations  are  impractical  or  too  costly,    some  degree 
of  quality  control  over    input  material  may  nevertheless  be  achieved  through  operations  at  the  stations 
where  data  to  be  read  are  received.     This  would  include  such  operations  as  culling  of  poor  quality 
materials  or  sorting  of  incoming  material  as  to  size  and  style  of  font  for    routing  to  different  readers 
each  of  which  reads  a  different  type  style.     Finally,   in  applications  where  for  other  reasons  (such  as 
compression  of  records  for  archival  storage),    photographic  reproductions  are  used  as  the  carriers 
in  the  character  reading  process,    certain  controls  can  be  applied  to  improve  quality.     For  example, 
operations  are  available  to  provide  enhanced  contrast  between  figure   and  ground  or  to  normalize  the 
size  of  different  lines  of  text. 

3.3    Stylization and Standardization

The  extent  of  dissimilarity  between  different  characters  in  the  same  symbol  vocabulary  is  a 
major  factor  in  the  elimination  of  ambiguity  in  a  recognition  process.       Therefore,   the  possibilities 
for  development  of  a  special  type  style  and  for  the  adoption  of  such  a  type  style  as  the  basis  for  a 
standardized  font  hold  considerable  promise  for  controlled  solutions  to  automatic  reading  problems. 
In addition, standardization of type styles and fonts would have many advantages in design of
equipment such as new high-speed printers, in procurement of typewriters and printers, and in usage
of printing and typing equipment.

The  design  of  specific  type  styles  in  the  past  has  been  largely  influenced  by  traditional  and 
aesthetic  considerations  in  the  general  art  of  typography.     The  use  of  special  type  styles  in  proposed 
reader  devices,   however,   has  been  directed  primarily  to  the  production  of  characters  that  differ 
significantly  one  from  another  in  relatively  simple  aspects.     The  special  styles  developed  have 
included  superimposed  ornamentation  or  exaggeration  of  character  configurations  where  the  location 
of  the  superimposed  indicia  is  a  direct  clue  to  the  identification  of  the  characters.     Other  examples 
are  'cut  fonts'  where  breaks  in  normal  strokes  of  characters  provide  identity  clues  by  location  or  by 
width  of  cut,    and  fonts  designed  to  cover  certain  distinct  areas. 

Among the bases for character differentiation in especially designed fonts that have been proposed
are the following: variations in the total width of the character, the height of the character, the total
inked area, 4/ the width of individual character strokes, the number and position of strokes between
different characters, the opaqueness or pattern of the body of character strokes or character
elements, 5/ and the geometry of the character such that there are unique patterns of coincidence of
black areas (character stroke elements) with the intersections of a superimposed grid.

1/ See Dimond, Johnson, Refs. 98, 240, respectively, and Figure 16, p. 49.

2/ See Ref. 143.

3/ Greenough, M. L., private communication, November 15, 1960.

4/ The patent awarded to Davis and Hinton (U.S. Patent 2,500,630, Ref. 86) discloses a proposed
reading aid for the blind in which variations in total area, transformed as relative intensity of
reflected light to a photocell, are claimed to result in unique displacements of a D'Arsonval
mirror so as to select an appropriate sound track and provide an audible output representative
of the character scanned. It is to be noted, however, that gross area variation provides an
inadequate basis for discriminating more than a few characters if they are shaped in anything like
normal configurations.

5/ See, for example, the de France patent, Ref. 88.

Figures 3 through 7 illustrate some of the special stylized fonts that have been designed to
facilitate automatic reading by machine. The examples include: special fonts for use with magnetic
ink character readers, Figure 3; special fonts from the patent literature, Figure 4; special
area-covering fonts, Figure 5; 'cut fonts', Figure 6; and special fonts for optical reading equipment,
Figure 7. In Figure 3, examples (a) and (b) illustrate magnetic ink fonts developed by Stanford
Research Institute for the Bank of America, while example (c) shows the E-13B special font currently
used in the MICR systems that have been adopted by many American banks.

Figure 4 shows two examples from the patent literature of uniquely embellished characters or
fonts having, in Broido's classification, 1/ "external code marks." The upper one, that of
Heidinger, 2/ is actually more of a 'dual language' technique, with discrimination to be based on the
varied sizes and locations of the added dots. Example (b) shows specially stylized characters with
discriminating embellishments as disclosed by Dickinson, et al., in 1937. 3/

The converse technique to that of external marking and embellishment is that of "internal
positional codes" or critical-area-covering designs, wherein: "Recognition can be achieved by
designing each character so that it possesses unique and readily recognizable features... portions of
each character may be arranged to cover a predetermined number of code positions." 4/

A vertical-zone area-covering design is used in the E.M.I. "FRED" system, example (a) of
Figure 5. Figure 5(b) shows the critical area-covering font of Maul's U.S. Patent 2,294,679, in
which he describes the technique as follows:

"... The graphical characters, such as for instance the numerals, are so selected
in their configuration that in accordance with an index-point system... a different
combination of black and light analyzing points is obtained for each character, whereby ...
a differential control of the machine is made possible." 5/
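
The index-point idea quoted above can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the 3x5 glyph shapes, the names `GLYPHS`, `signature`, and `find_unique_points`, and the brute-force search are hypothetical and are not taken from Maul's patent. The sketch only shows what it means for a set of analyzing points to give "a different combination of black and light" for each character.

```python
from itertools import combinations

# Hypothetical 3x5 glyphs ('#' = black, '.' = light); not taken from any patent.
GLYPHS = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": ["..#", "..#", "..#", "..#", "..#"],
    "2": ["###", "..#", "###", "#..", "###"],
    "3": ["###", "..#", "###", "..#", "###"],
}

def signature(glyph, points):
    """Black/light reading of one glyph at the chosen analyzing points."""
    return tuple(glyph[r][c] == "#" for r, c in points)

def find_unique_points(glyphs, max_points=4):
    """Smallest set of index points that gives every glyph a distinct signature."""
    sample = next(iter(glyphs.values()))
    cells = [(r, c) for r in range(len(sample)) for c in range(len(sample[0]))]
    for k in range(1, max_points + 1):
        for points in combinations(cells, k):
            sigs = {signature(g, points) for g in glyphs.values()}
            if len(sigs) == len(glyphs):
                return points
    return None  # no such set within max_points

points = find_unique_points(GLYPHS)
for name, glyph in GLYPHS.items():
    print(name, signature(glyph, points))
```

With four characters, at least two analyzing points are needed, since one point can distinguish only two classes. A font designed for this method would choose the character shapes so that a small, fixed set of points suffices for the whole vocabulary.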

Figure 6 shows two examples of the so-called 'cut fonts.' In cut font stylization there are
deliberately  placed  breaks  or  gaps  in  the  strokes  that  make  up  a  character  pattern.     Recognition  is 
based  not  on  the  character  pattern  itself  but  rather  on  the  detected  pattern  of  these  gaps  in  the 
normally  black  stroke  construction.     Detection  of  the  gaps,    either  by  number  encountered  or  by 
location  with  respect  to  specified  subareas  of  the  image  field,    thus  serves  as  a  code  upon  which 
identification  can  be  based.     In  other  words,   the  portions  of  the  character  that  are  missing  tell  what 
the  character  is. 

The  BULL  magnetic  ink  reader  font,    example  (a)  of  Figure  6,    is  a  special  case  of  a  multiple-cut 
font,   in  which  the  patterns  of  relative  widths  of  the  vertical  gaps  cut  through  the  entire  character 
from  top  to  bottom  serve  as  the  basis  for  recognition.     Cuts  may  be  so  placed  that  they  provide  direct 
binary  encoding--a  single  cut  in  the  lowest  level  zone,    "1";  a  single  cut  in  the  next  lowest  level,    "2"; 
cuts in both the lowest and next lowest levels, "3", and so on. A cut font design of Rabinow
Engineering Company is illustrated in example (b) of Figure 6.
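
The direct binary encoding by cut position described above (lowest zone "1", next lowest "2", both "3") amounts to treating each zone as one binary digit. A minimal Python sketch follows; the function names and the choice of four zones are assumptions for illustration, and the sketch does not model how a reader would actually detect the cuts.

```python
def decode_cuts(cut_levels):
    """Value encoded by a set of cut zones; level 0 is the lowest zone."""
    value = 0
    for level in set(cut_levels):
        value += 1 << level  # a cut in zone `level` contributes 2**level
    return value

def encode_cuts(value, zones=4):
    """Zones that must be cut to encode `value` (inverse of decode_cuts).

    The number of zones (4) is an assumed parameter, not a figure from the text.
    """
    if not 0 <= value < (1 << zones):
        raise ValueError("value does not fit in the available zones")
    return [level for level in range(zones) if (value >> level) & 1]

print(decode_cuts([0]))     # single cut in the lowest zone
print(decode_cuts([1]))     # single cut in the next lowest zone
print(decode_cuts([0, 1]))  # cuts in both
```

Under this assumed scheme, four zones would be enough to encode the ten numerals, since four binary positions give sixteen combinations.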

Examples of special fonts used for optical reading equipment are shown in Figure 7. Example (a),
the Farrington-IMR Scandex "Selfchek" font, was developed originally for charga-plate use in the oil
industry. This is a "matchstick" type of construction, a "built-in bar code," 6/ in which the
different stroke combinations are used to provide a minimum of two stroke differences between any
two numerics in this type face. For certain applications the account number also contains a redundant
digit, so that, for a single error, the missing digit may be automatically calculated. 7/
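
The report does not say how the redundant digit is computed. Purely as an illustration of how one unreadable digit can be recalculated, the following Python sketch assumes a simple rule in which the digits of the account number sum to a multiple of 10. That rule and the function names are assumptions, not the Farrington scheme.

```python
def add_check_digit(digits):
    """Append a digit that makes the sum of all digits a multiple of 10."""
    check = (-sum(digits)) % 10
    return digits + [check]

def recover_missing(digits):
    """Fill in one unreadable digit (given as None) using the mod-10 sum rule."""
    missing = [i for i, d in enumerate(digits) if d is None]
    if len(missing) != 1:
        raise ValueError("exactly one unreadable digit can be recovered")
    known_sum = sum(d for d in digits if d is not None)
    restored = list(digits)
    restored[missing[0]] = (-known_sum) % 10
    return restored

account = add_check_digit([4, 7, 1, 9])
print(account)
print(recover_missing([4, None, 1, 9, 9]))
```

A rule of this kind recovers a digit only when the reader already knows which position failed, which is the "missing digit" case the text describes. A digit that is misread without being flagged can only be detected, not located.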

1/ Broido, D. "Recent work on reading machines for data processing. Part 1: Masking, scanning,
and external coding techniques." Ref. 61.

2/ Heidinger, W., U.S. Patent 2,362,004, Ref. 207.

3/ Dickinson, A. H. and J. N. Wheeler, U.S. Patent 2,261,542, Ref. 94.

4/ Broido, D. "Recent work on reading machines for data processing," Pt. II, Ref. 61, p. 224.

5/ Maul, M., Ref. 303.

6/ Wentworth, V. "Farrington has optical scanning lead," Ref. 530.

7/ See Refs. 202, 249, 419, 483.

Figure 6. Examples of 'Cut' Fonts

Example (b) of Figure 7 is that of the National Cash Register Company font, which also provides
self-checking features. The characters are constructed by considering the space the figures are to
occupy as divided into an upper and lower half, each of which may contain one or two vertical strokes
in any of five stroke positions. Thus, in effect, the automatic reader will view each character as
being composed of two five-bit numbers. For machine purposes, this design is actually one which
requires only the vertical strokes, the horizontal ones being added merely for the convenience of
human readers. 1/
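
The "two five-bit numbers" view can be made concrete by enumerating the stroke patterns that the description allows. The Python sketch below takes the five stroke positions and the one-or-two-strokes rule from the text; the function names and the order in which the halves are packed into one value are assumptions for illustration.

```python
from itertools import combinations

POSITIONS = 5  # stroke positions per half, as stated in the text

def half_patterns(positions=POSITIONS):
    """All five-bit words with exactly one or two stroke positions set."""
    words = []
    for count in (1, 2):
        for chosen in combinations(range(positions), count):
            words.append(sum(1 << p for p in chosen))
    return words

def character_code(upper, lower):
    """Pack the two halves into one 10-bit value (packing order is assumed)."""
    return (upper << POSITIONS) | lower

halves = half_patterns()
print(len(halves))       # 5 one-stroke + 10 two-stroke patterns = 15
print(len(halves) ** 2)  # 15 * 15 = 225 possible upper/lower combinations
```

Only a few of the 225 combinations are needed for a numeric vocabulary, so the unused ones leave room for the self-checking property: a reading that falls outside the assigned set can be rejected rather than misidentified. Which combinations the actual font assigns is not stated in the report.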

Many  of  these  bases  for  character  differentiation  in  specially  designed  type  styles  have  definite 
limitations.     Those  which  have  to  do  with  the  gross  style  of  the  character,    such  as  the  width,   height, 
or  total  area  of  the  character,    either  presuppose  a  relatively  limited  vocabulary  (10-20  characters)  or 
require  high  resolution  in  the  scanning-recognition  system  to  detect  small  differences.     In  addition, 
for  typed  material,    considerable  variation  in  actual  area  of  a  character  often  occurs  even  on  the  same 
page,   and  character-image  area  variations  may  occur  not  only  as  effects  of  paper  quality  or  ink 
density,   but  also  as  effects  of  temperature  and  humidity.     Variations  in  specific  style  may  result  in 
character  forms  that  are  clumsy  in  appearance  and  therefore  may  be  difficult  for  the  human  viewer  to 
read easily. Variations in opaqueness of character strokes (such as the use of differentially spaced
fine lines to fill in the body of a character element) 2/ may have limitations similar to those involved
in variations of gross style.

A particular embodiment of the Weaver character reader invention for telegraphic transmission
utilized only five selected sub-areas of an image field for character discrimination. 3/ A font of 32
characters especially designed for unique black-white patterns for the five positions would be required.
Weaver illustrated such a font which, for particular characters, resulted in character shapes quite
different from their forms in a normal font.

One of the more promising bases for stylized characters
for use in a standardized font was studied at the Diamond Ordnance Fuze Laboratories in close
collaboration with the National Bureau of Standards. 4/ This is based on a 5x7 character element grid
that  is  particularly  well  adapted  to  both  mechanized  reading  and  the  design  of  high-speed  printing 
devices.     Many  of  these  devices  already  employ  elemental  printing  mechanisms  (capable  of  very  high 
speed  imprinting)  by  pressing  any  desired  combinations  of  pins  or  wires  from  a  cluster  of  35,    each  of 
which  corresponds  to  an  element  square  of  the  character  grid,    against  an  inked  ribbon  which  transfers 
the  impression  to  a  suitable  carrier  medium.     These  character  configurations  can  also  be  produced  by 
conventional  methods  of  printing  and  typing.     Figure  8  shows  a  method  for  construction  of  characters 
in  a  5x7  grid.     Individual  characters  are  designed  by  inscribing  circles  in  any  of  the  35  squares  of  the 
grid  and  by  drawing  tangent  lines  to  the  other  character  elements,   as  illustrated  in  Figure  8. 
Characters  of  such  shapes  can  be  readily  produced  in  the  elemental  printing  devices. 
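
As a rough illustration of the relation between the 5x7 grid and the cluster of 35 printing pins described above, the following Python sketch maps a character drawn on the grid to the pins that would be pressed. The glyph shown, the names `GLYPH_T`, `pins_to_fire`, and `to_bits`, and the row-by-row pin numbering are all assumptions; the report does not specify a numbering.

```python
ROWS, COLS = 7, 5  # 5 columns by 7 rows = 35 element squares

# Hypothetical glyph for the letter "T"; not copied from the report's figures.
GLYPH_T = [
    "#####",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
]

def pins_to_fire(glyph):
    """Pin numbers (0-34, numbered row by row) for the black element squares."""
    if len(glyph) != ROWS or any(len(row) != COLS for row in glyph):
        raise ValueError("glyph must be 7 rows of 5 cells")
    return [r * COLS + c
            for r, row in enumerate(glyph)
            for c, cell in enumerate(row) if cell == "#"]

def to_bits(glyph):
    """Pack the grid into one 35-bit integer, one bit per element square."""
    return sum(1 << pin for pin in pins_to_fire(glyph))

print(pins_to_fire(GLYPH_T))
print(format(to_bits(GLYPH_T), "035b"))
```

The same 35-bit pattern serves both directions: a printer presses the listed pins, and a reader that samples the 35 element squares obtains the pattern back for comparison with its stored vocabulary.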

A  recommended  font  for  business  machine  use,    including  typewriters,    is  illustrated  in  Figure  9, 
and  Figure   10  indicates  some  of  the  alternate  character  shapes  that  are  possible  in  this  font.     Very 
large  total  symbol  vocabularies  can  be  achieved.     This  font  has  been  used  by  RCA  in  certain  systems 
for  optical  character  recognition  now  under  development,    especially  for  large-scale  business  data 
processing  operations.     Similar  5x7  fonts  are  being  used  by  Eastman  Kodak,   Addressograph- 
Multigraph,    IBM,   and  others.     In  addition,    several  typewriter  manufacturers  have  indicated  the 
feasibility of supplying a type-face incorporating this particular style.

The  Subcommittee  on  Character  Recognition  Standards  (X3-1)  of  the  American  Standards 
Association  has  reviewed  the  various  proposed  stylized  fonts.     The  5x7  grid  has  been  considered,   but 
with  modifications  both  to  make  the  style  more  pleasing  to  the  eye  and  to  accommodate  scanning 
features  of  several  different  readers.     As  of  February  1961,   the  preliminary  recommendations  were 
reported  to  be  for  5x9  grid  characters  of  uniform  width,    with  strong  right  edges,   and  with  minimum 
serifs.     5/ 

1/ "This code 'within the character' achieves the goal of combining two languages in one. The
machine reads only a code while the human reads conventional printed characters slightly
stylized." (Ref. 346, p. 28.)

2/ See de France, H. "Numbers reading device," Ref. 88.

3/ Weaver, A. "Telegraph reading machine," Ref. 527.

4/ Rabinow, J. "Standardization of the 5x7 font," Ref. 368.

5/ See "Optical recognition--the breakthrough is here," Ref. 349, p. 22.

Figure 8. Construction of Characters in the 5x7 Grid

A stylized font based on either the 5x7 or the 5x9 grid lends itself readily to various automatic
scanning recognition techniques. These include reader designs which use a sequential scan and also
those which make a simultaneous analysis of the entire image field. Since the character shape is
preunitized, by reference to the element squares of the theoretically superimposed grid, the
resolution of scan may be arbitrarily improved at the points of critical analysis by application of the
same detent principle that is involved in locating gross scan reference marks. An advantage of the
5x7 or 5x9 style for both automatic reading and for photocopying or microfilming of the printed or typed
material is the fact that the thickness of the character lines bears a fixed relation to the size of the
character and that the width of the lines is constant for a particular font.

Figure   11  shows  a  preliminary  design  for  a  possible  standardized  font  that  presents  a  compromise 
which  adequately  meets  the  requirements  of  both  those  reading  techniques  that  are  based  on  stroke 
analysis  and  those  that  require  area-correlations.     It  follows  the  5x9  matrix  recommendations  under 
consideration  by  the  X3-1  subcommittee. 

General adoption of a standardized type style would obviously require the cooperation and
agreement of manufacturers of printing devices, type-setting equipment, and the like. It should be based
upon  a  design  well-adapted  to  automatic  reading  as  well  as  being  easily  reproducible,   well-adapted  to 
both  conventional  and  high  speed  printing  devices,   and  clearly  legible  to  the  human  eye.     Even  more 
significantly,   however,   typefaces  of  arbitrarily  specialized  design  can  now  be  obtained  and  installed 
in  at  least  some  makes  of  ordinary  typewriters.     Further  standardization  will  require  widespread 
management  appreciation  of  the  advantages  to  be  gained  not  only  for  automatic  reading  but  for  other 
administrative  purposes  as  well. 

Alternatives  to  the  adoption  of  and  possible  standardization  on  one  or  a  few  type  styles  designed 
for  automatic  reading  have  included  proposals  to  arbitrarily  increase  the  size  of  characters  to  be 
read  (by  preprinting,   by  specification  of  equipment  to  be  used,    or  by  modifications  of  existing 
equipment),   or  to  increase  the  space  around  characters,   both  inter-line  and  inter-character.     It 
should  be  noted  that  increasing  the  space  between  characters  involves  more  extensive  and  more 
costly  alteration  of  existing  typewriters  and  business  machines  than  does  the  replacement  of  type 
bars  with  others  bearing  characters  in  a  specially  stylized  font.     In  addition,    increasing  the  space 
results  in  a  considerable  decrease  in  the  storage  density  of  carrier  documents. 

In most cases, the use of a stylized type-font such as those shown in Figures 9, 10, and 11
would provide the advantages of both preprinting and quality control of material to be read by
automatic recognition techniques. Moreover, these advantages might be gained with less administrative
complexity and at a cost less than that typically involved in installing equipment that simultaneously
produces copy in both normal printed and in machine-usable form.

3.4    Limitation of Vocabulary

Second  only  to  maintenance  of  consistent  high  quality  of  input,   limitations  on  the  size  and  nature 
of  the  vocabulary  of  characters   to  be  recognized  affect  the  feasibility  and  cost  of  development  of 
specific character recognition systems. If only 16 to 20 numerics and special symbols used on an
accounting tally roll, or only the upper-case characters of standard teletype font are to be read,
practical character readers can be, and have been, designed. Examples are the Solartron ERA
(Electronic Reading Automaton) system and one of the early Farrington-IMR machines, respectively.
Automatic  reading  techniques  for  a  small  vocabulary  situation  have  been  demonstrated  where  the 
"characters"  in  the  limited  vocabulary  are  the  typed  names  or  abbreviations  of  a  selected  list  of 
cities  and  states.     Examples  again  include  an  early  Farrington-IMR  machine  and,   more  recently, 
machines  under  development  by  both  Farrington  and  the  Philco  Corporation  for  the  U.S.    Post  Office 
Department. 

For the small vocabulary situations, relatively simple recognition logics are normally sufficient.
Techniques for parallel matching of input characters with all the reference characters (i.e., 20 to 50
in the vocabulary) are also quite readily available, as in multiple-character template arrays or
decoding matrices for sampled waveforms. In this connection we note that "... a successful
character recognition machine may be developed without the necessity of being concerned with
fundamental pattern recognition problems," 1/ and that the hardware may be relatively inexpensive to
build and easy to maintain. Unfortunately, however, many of these techniques are not capable of
being extended to significantly larger vocabularies (i.e., more than several hundred individual
character forms).

1/ Kirsch, R. A., in Ref. 49, p. 234.

The truly large-vocabulary situations are exemplified by requirements for reading pages of text
from many sources and perhaps in many languages. Here we must deal with complete alphabets in
different fonts, with different sizes and with italic variations for each font. In addition to font and
size variations, special characters such as mathematical symbols and interspersed graphic
material may appear on any given page. In just a few of these situations, controlled solutions may
be adopted by rigorous limitation of the particular fonts and sizes within a font to be accepted by the
character recognition system. A compromise solution of this type would require adequate provision
for blocking out or ignoring the unacceptable material if such material occurs on the same carrier
unit as that which is accepted.

Some  representative  limitations  of  acceptable  vocabulary  in  character  recognition  systems  that 
have  been  tested  are  indicated  in  Table  1. 

A  variety  of  scanning  and  sensing  methods  and  a  variety  of  different  principles  of  recognition 
logic  have  been  applied  in  the  limited  vocabulary  reading  systems  listed  in  the  table.     In  many  cases, 
the  particular  scanning  technique  and  recognition  logic  adopted  has  a  direct  effect  upon  the  size  and 
nature  of  the  vocabulary  that  can  be  accommodated.     On  the  other  hand,   as  we  have  previously  noted, 
such  factors  as  consistent  high  quality  of  input  usually  outweigh  differences  in  scanning  and  logic  in 
considering  the  use  of  automatic  character    recognition  techniques  in  a  particular  operational 
situation. 

Before  discussing  characteristics  of  representative  systems,   therefore,   we  shall  indicate  some 
of  the  variety  of  combinations  of  scanning  and  logic  that  can  handle  high-quality  input  and  we  shall  then 
review  various  operational  requirements  typically  encountered.     We  shall  consider  first  the  general 
process  of  character  recognition. 

4.  THE  CHARACTER  RECOGNITION  PROCESS 

Typically,   the  process  of  recognizing  an  object  consists  first  of  sensing  or  scanning  the 
characteristic properties of that object. These properties occur as patterns (e.g., of color, sound,
or geometric configuration) in some spatial or temporal order, in some appropriate sensing field,
such as the visual or auditory field in the human. Upon input of such a source pattern, various operations
to  isolate,    reinforce,    or  improve  the  source  pattern  may  occur,    transforming  it  into  an  input  pattern 
which  is  equivalent  to  the  source  pattern.     This  pattern  may  then  be  systematically  compared  with 
previously  stored  reference  patterns  in  order  to  determine  the  identity  of  the  input  pattern  or  to 
classify  it  as  being  a  member  of  a  class  of  patterns  that  have  been  previously  sensed.     Figure   12 
represents  a  schematic  diagram  of  these  and  subsequent  operations  that  may  be  involved  in  a 
generalized  recognition  process.     Before  considering  these  process  steps  as  applied  specifically  to 
automatic  character  recognition,   however,   let  us  first  consider  some  commonly  used  methods  for 
character  recognition  in  various  reader  devices. 

4.1 Some Common Methods for Character Recognition

Figure   13  illustrates  in  simplified  form  some  of  the  commonly  used  methods  for  character 
recognition  employed  in  automatic  reader  devices  which  are  already  in  use  or  which  have  been 
demonstrated under laboratory conditions. These common methods will be described, with simplified
examples, in the following discussion. They illustrate various possible combinations and
implementations  of  the  processes  shown  in  Figure   12. 

4.1.1 Template Matching

Example (a) in Figure 13 is that of one of the earliest methods considered -- matching the input
pattern with a stencil, mask, or template -- which means, in effect, that the reference patterns are
photographic-negative images of the possible source patterns that are allowed in this type of
automatic reading system. The character may also be matched with respect to the white areas that
should surround it or be included in it. Glover gives the following definition:

"Template  matching  ...    is  defined  as  a  process  that  compares  a  group  of  templates, 
each  representing  a  specific  character,    to  a  sample  character  for  identification  based  on 
the  degree  of  matching  between  each  template  and  the  sample.     The  templates  are 
identical  copies  of  the  characters  themselves,    including  the  white  area  both  in  and  around 
the  black  portion  of  the  characters  within  a  square  that  is  large  enough  to  enclose  the 
largest character." 1/

This method is also known as a 'map-matching' or 'mask-sensing' technique. 2/


1/ Glover, E.B. "Simulation of a character recognition system utilizing a general purpose
digital computer," Ref. 167, p. 1 (preprint).

2/ See Broido, D. "Recent work on reading machines for data processing," Ref. 61, who cites
as examples the Tauschek, British Thomson-Houston, Handel, Goldberg, and Bryce patents.
(Refs. 475, 476, 477, 60, 191, 169, 65.)
Figure 12. A Generalized Recognition Process

Figure 13. Some Common Methods of Character Recognition

Goldberg, in his 1931 U.S. Patent, 1/ disclosed this principle as applied to statistical operations
such  as  selecting  and  counting  particular  records  identified  by  some  specified  combination  of  indicia, 
for  example,  various  alphabetic  and  numeric  symbols.     In  a  suggested  embodiment  he  visualized  the 
records  as  being  stored  on  a  positive  photographic  transparency  and  a  'search  plate'  with  the  negative 
images  of  the  selection  criteria  processed  as  follows: 

"If  the  transparency  containing  the  various  statistical  indications  is  now  run  through  the 
machine  in  such  a  manner  that  the  negative  coincides  with  the  transparency  a  complete 
coincidence  impenetrable  to  any  light  or  heat  radiation  will  only  be  possible  in  one  defined 
case;  this  will  only  occur  when  the  negative  bears  exactly  the  same  characters,   marks, 
figures, etc., as the transparency in question, the only difference being that in the negative
these records are light on dark ground, while in the transparency they are dark on light
ground. A certain combination has thus been picked out of a large number of others with
extraordinary speed and reliability hitherto not obtainable. In order to obtain the coincidence
of the negative with the transparency they can be either brought into contact (direct
superposition) or be projected one upon the other (optical superposition); the latter method being
more advantageous as the mechanical features of the machine are simplified."

Thus  we  have  the  nucleus  for  a  character  recognition  system  in  which: 

(1)  The  input  of  the  source  pattern  requires  the  positioning  and  illumination  of  the  carrier 
(such  as  paper)  on  which  the  character  is  imprinted; 

(2)  The  source  pattern  is  transformed  into  the  input  pattern  by  a  suitable  means  of  optical 
projection; 

(3)  The  projected  pattern  is  optically  superimposed  against  the  reference  patterns  which 
are  the  photographic  negative  images  of  the  characters  permitted  in  the  vocabulary 
of  the  system; 

(4)  When  the  input  pattern  exactly  coincides  with  some  one  reference  pattern,   light  is 
extinguished  to  a  photocell  located  behind  the  reference  pattern  mask;  and 

(5) The extinction of light to the photocell, in combination with suitable means for identifying
the particular reference pattern for which the exact coincidence occurred, is
used  to  effect  the  output  of  a  target  pattern,    such  as  punching  the  appropriate 
combination  of  holes  in  punched  card  or  paper  tape  that  is  the  code  for  the  character 
identified. 
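
The five steps above can be sketched in a few lines of Python. This is an illustrative model only, not a description of Goldberg's apparatus: the 3x3 glyphs and the names `GLYPHS`, `light_through`, and `extinction_matches` are invented for the example. A cell value of 1 is assumed to mean ink on the input, and the negative mask is assumed to be transparent exactly where the reference character has ink.

```python
# Toy model of extinction-type template matching (assumed 3x3 glyphs).
# "1" = ink on the input; the negative mask is transparent where the
# reference character has ink and opaque everywhere else.
GLYPHS = {
    "F": ["111",
          "110",
          "100"],
    "E": ["111",
          "110",
          "111"],
    "L": ["100",
          "100",
          "111"],
}


def light_through(input_rows, reference_rows):
    """Count cells where light reaches the photocell: the mask is
    transparent there (reference has ink) but the input has no ink."""
    return sum(
        1
        for in_row, ref_row in zip(input_rows, reference_rows)
        for in_cell, ref_cell in zip(in_row, ref_row)
        if ref_cell == "1" and in_cell == "0"
    )


def extinction_matches(input_rows):
    """Return every vocabulary character whose mask extinguishes the light."""
    return [name for name, ref in GLYPHS.items()
            if light_through(input_rows, ref) == 0]
```

With these toy glyphs an input "F" extinguishes only its own mask, whereas an input "E" also extinguishes the light behind the "F" and "L" masks, because its strokes cover theirs. Pure extinction therefore cannot by itself separate a character from one that covers it.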

The statistical machine proposed by Handel in 1931 2/ similarly provides that successive
comparisons  be  made  between  a  source  pattern  character  image  on  a  carrier  item,    such  as  a  card 
with  printed  numerical  information,   and  each  of  a  series  of  character  stencils.     The  reference 
pattern  stencils  are  arranged  on  a  rotating  disk,   with  photoelectric  apparatus  employed  to  respond  to 
extinction-type  coincidence.     When  coincidence  is  detected,   the  position  of  the  disk  triggers  a  tallying 
operation  for  the  proper  symbol  or  character  category.     In  still  another  example,   in  this  case 
directed to reading aids for the blind, Sharples 3/ discloses the use of a drum with photoelectric cells
and  a  "system  so  focussed  as  to  project  an  image  of  the  proper  size  to  coincide  with  one  of  an 
alphabet  series  of  transparent  stencils.  " 

This  basic  system  continues  to  provide,    with  many  improvements  and  refinements  in  the  various 
process  steps,   a  common  method  for  automatic  character  recognition.     Early  work  at  the  Diamond 
Ordnance  Fuze  Laboratories  was  based  upon  the  improvement  of  optical  matching  techniques  used 
in  the  selection  system  of  the  Bush  Microfilm  Rapid  Selector  (which  also  utilizes  a  Goldberg-type 
principle).     Improvements  were  desired  that  would  minimize  the  limitations  of  a  direct  comparison 


1/ Goldberg, Emanuel. "Statistical machine." Ref. 169.

2/ Ref. 191.

3/ Sharples, A. R. "Audible reading apparatus," Ref. 414.
scheme in which the entire field of the input image was matched against a mask containing the
complement  of  the  pattern  to  be  selected,   with  the  extinction  of  light  to  photocells  occurring  only 
when  exact  match  occurred.     Rabinow,   then  in  charge  of  the  DOFL  project,    recognized  that  the 
problems  of  differentiating  between  mismatches  and  light  leakage,   or  variations  in  the  relative 
contrast  between  dark  portions  of  the  input  pattern  and  its  lighter  background,   might  be  solved  by 
examining the input field in small sub-areas one at a time. The change in electrical output for a
true mismatch in any portion of the image is significantly greater than the changes usually arising from
light leakage, background noise, or imperfections in the blackness of the input character in each
small sub-area.

The DOFL First Reader 1/ utilizes this principle of scanned comparison. A prototype model,
built in 1954 merely to demonstrate feasibility, recognized typewritten characters of a single type
style at a rate of one character per minute. In this model, the source pattern is derived from the
unknown  typed  character  symbol  and  is  transformed  into  the  input  pattern  by  optical  projection  of  the 
image  field.     This  input  pattern  is  scanned  by  a  modified  Nipkow  disk  and  is  then  compared 
sequentially  with  a  mask  containing  as  reference  patterns  the  photographic  negative  images  of  each 
of the characters in the vocabulary. As each scanned comparison is tried sequentially, the photocell
output, connected to a modified peak detector, produces charges in a set of capacitors that are
proportional to the quality of match between the unknown input pattern and the reference pattern (mask
character) against which it is projected.

Recognition  elements  are  connected  to  the  capacitors  in  such  a  way  that  when  the  comparison 
cycle  has  been  completed  the  "best  fit  match"  is  directly  identified.     It  will  be  noted  that  the 
recognition  logic  is  simply  that  of  best  match  of  the  complete  input  pattern  as  a  single  input  pattern 
element  with  the  images  representing  the  vocabulary.     Minor  additions  of  logic  circuit  interlocks  to 
distinguish between an "I" covered by a "T" or an "F" covered by an "E" are provided. The
transformation of the results of the matching process to the observed identification formula is thus an
identity transformation, and the matching-recognition thresholds of Figure 12 would therefore
coincide  for  this  system. 
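
The "best fit" selection and the covering-character interlock can be illustrated with a small Python sketch. The glyph shapes, the names `VOCAB`, `match_quality`, and `best_fit`, and in particular the tie-breaking rule are assumptions made for the example; the source does not describe how the interlock circuits were actually wired.

```python
# Sketch of "best fit" selection with a covering-character interlock.
# Glyph shapes and the tie-breaking rule are assumptions for illustration.
VOCAB = {
    "I": ["010", "010", "010"],
    "T": ["111", "010", "010"],
    "L": ["100", "100", "111"],
}


def ink_cells(rows):
    """Set of (row, column) positions that carry ink."""
    return {(r, c) for r, row in enumerate(rows)
            for c, ch in enumerate(row) if ch == "1"}


def match_quality(input_rows, ref_rows):
    """Fraction of the mask's transparent area blocked by input ink
    (1.0 corresponds to complete extinction)."""
    ref, inp = ink_cells(ref_rows), ink_cells(input_rows)
    return len(ref & inp) / len(ref)


def best_fit(input_rows):
    # One stored value per mask, standing in for the capacitor charges.
    charges = {name: match_quality(input_rows, ref)
               for name, ref in VOCAB.items()}
    top = max(charges.values())
    tied = [name for name, q in charges.items() if q == top]
    # Interlock (assumed rule): among tied masks prefer the one with the
    # most ink, so that a "T" is not reported as the "I" that it covers.
    return max(tied, key=lambda name: len(ink_cells(VOCAB[name])))
```

Here an input "T" fully blocks both the "I" and the "T" masks; the assumed interlock resolves the tie in favor of "T", while an input "I" blocks only three fifths of the "T" mask and is read as "I".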

Various techniques for increasing the recognition rate of this type of template reader, e.g., by
the substitution of electronic for disk scanning and by simultaneous rather than sequential
comparisons against the array of vocabulary masks, were proposed by Rabinow before he left
Government to establish his own engineering concern. Some of these suggestions have been
subsequently explored at NBS, as described below.

A  demonstration  model  of  a  reader  for  typewritten  characters,   using  an  "adequate  fit"  or 
threshold  setting  scanned  comparison  method,   was  designed  and  built  by  Greenough  and  Gordon  of  the 
Electronic Instrumentation Section. 2/ As in the DOFL Reader, the unknown source pattern is
projected optically against one of a series of reference patterns that are photographic negative images
of  the  characters  to  be  identified.     The  resultant  light  pattern  is  scanned  to  determine  the  exact 
degree  of  match  between  the  input  pattern  mask  and  the  reference  pattern.     Masks  representing  the 
vocabulary  of  reference  patterns  for  a  particular  type  font  are  changed  automatically  until  a  match 
is  obtained.     Electrical  indication  is  then  provided  for  the  particular  mask  that  is  being  tried  when 
recognition  occurs.     This  apparatus  differed  from  the  earlier  DOFL  mechanical  model  chiefly  in 
employing  electronic  scanning  and  modified  recognition  techniques.     As  a  result,   an  operating  speed 
at  the  rate  of  one  character  recognized  every  two  seconds  was  achieved  in  an  experimental  model. 
This  model  equipment  was  able  to  recognize  all  of  the  letters  of  the  alphabet  (upper  case)  and  all 
numbers  under  normal  spacing  conditions. 

In  this  demonstrator  model  of  a  "typereader",   the  masks  used  are  individual  frames  on  16  mm 
motion  picture  film  with  26  letters,    9  numbers,   and  a  blank  space  on  an  endless  belt  run  through  a 
modified  movie  projector.     The  light  passing  through  the  mask  opening  is  projected  upon  an  image 
dissector  television  pickup  tube.     Electronic  scanning  then  provides  the  input  pattern  in  the  form  of 
optical  signals  upon  which  recognition  can  be  based.     The  signal  is  first  passed  through  a  clipper, 
emerging  as  either  "black"  or  "white"  signals,    depending  upon  whether  ink  or  paper  is  seen  through 
the  mask.     Those  areas  where  white  paper  is  visible  through  the  transparent  portions  of  a  particular 
mask  are  summed  in  an  integrator.     When  the  integrated  total  mismatch  is  below  a  certain 
recognition  threshold  value,   this  mask  is  recognized  as  the  correct  one. 

In  the  reading  process,   a  telephone  type  selector  switch  moves  a  deflector  vane  that  is  used  to 
provide  a  small  horizontal  shift  of  the  unknown  character  with  respect  to  each  mask.     This  allows  a 
certain  amount  of  correction  for  horizontal  displacement.     Six  horizontal  scans  occur  while  the 
deflector  vane  moves  slowly  from  left  to  right,    so  that  there  are  "tries"  at  a  match  for  six  different 
horizontal  positions  of  the  unknown  character  with  respect  to  each  mask.     The  first  reference  pattern 


1/ Rabinow, J. "Report on the DOFL First Reader," Ref. 367.

2/ Greenough, M. L. and C. C. Gordon. "Technical details of print reader demonstrator,"
Ref. 182.
in the vocabulary for which recognition occurs directly determines the identification of the source
pattern  character.     The  last  mask  in  the  series  is  entirely  opaque  so  that  "recognition"  must  occur, 
thus  providing  an  indication  of  nonrecognition  of  any  of  the  standard  characters.     Since  the  masks 
are  tried  sequentially  in  this  model,   the  sequence  of  matching  has  been  arranged  to  eliminate 
possible ambiguities, e.g., the letter "Q" mask occurs before the "O" mask in order to screen out
erroneous  recognition  of  an  actual  Q  as  an  apparent  O. 
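
The sequential "adequate fit" procedure just described (integrated mismatch, a recognition threshold, several horizontal tries per mask, a final opaque mask, and "Q" ordered before "O") can be sketched as follows. The 4x4 glyphs, the threshold of 1, the use of three shift positions rather than six, and all function names are assumptions chosen to keep the example small.

```python
# Sketch of sequential "adequate fit" recognition with a reject mask.
# Glyphs, threshold, and shift range are assumptions for illustration.
MASKS = [                                    # order matters: "Q" before "O"
    ("Q", ["1110", "1010", "1110", "0010"]),
    ("O", ["1110", "1010", "1110", "0000"]),
    (None, ["0000", "0000", "0000", "0000"]),  # opaque mask: always "matches"
]


def shifted(rows, dx):
    """Shift the input dx cells to the right (negative = left), filling
    the vacated cells with blank paper."""
    width = len(rows[0])
    if dx >= 0:
        return [("0" * dx + row)[:width] for row in rows]
    return [(row + "0" * (-dx))[-width:] for row in rows]


def mismatch(input_rows, mask_rows):
    """Integrated mismatch: white paper seen through transparent mask cells."""
    return sum(1 for in_row, m_row in zip(input_rows, mask_rows)
               for i, m in zip(in_row, m_row) if m == "1" and i == "0")


def recognize(input_rows, threshold=1, shifts=(-1, 0, 1)):
    """Try each mask in order, at each horizontal position; the first mask
    whose mismatch falls below the threshold names the character."""
    for name, mask in MASKS:
        for dx in shifts:
            if mismatch(shifted(input_rows, dx), mask) < threshold:
                return name
    return None
```

Because the strokes of "Q" cover those of "O", an input "Q" would also satisfy the "O" mask; trying the "Q" mask first removes that ambiguity. An input that satisfies no character mask falls through to the opaque mask and is reported as `None`, the counterpart of the nonrecognition indication.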

Apparently  a  threshold  of  mismatch  exists  that  will  permit  discrimination  between  all  of  the 
letters  and  numbers  provided  that  the  quality  of  the  typing  is  fairly  good.     This  threshold  appears  in 
practice  to  be  somewhat  less  than  that  of  the  cross-member  which  distinguishes  upper-case  "G"  from 
"C".     However,   further  study  would  be  required  in  order  to  determine  the  permissible  range  of  limits 
for  this  threshold. 

Following  the  trial  of  the  breadboard  typereader,   an  improved  design  was  proposed  in  which 
comparisons  would  be  made  against  all  stored  characters  in  parallel  in  order  to  achieve  practical 
reading speed. 1/ The proposed reading process is entirely electronic except for the paper-moving
mechanisms to bring each line and each character in each line successively into the field of view.
The field of view, illuminated by a light source providing about 20,000 foot-candles, would be
seen by an optical system consisting of a lens and an image dissector tube. Transformation of
source pattern (the image field in view) to the input pattern is accomplished by converting the optical
image into electrical signals in accordance with an applied scan pattern. The time-varying signal
derived from scanning the image in a raster which covers the field in a linear grid pattern would
then be compared with similar signals generated by similar scanning of the masks of the characters
stored in the vocabulary. An integrator for each mask would add up the integrated mismatch each
time  the  field  is  scanned  for  a  comparison,    so  that  the  integrator  giving  a  value  of  mismatch  less  than 
a  selected  threshold  value  will  be  that  which  designates  the  character  being  read. 

Various  portions  of  this  proposed  reading  system  were  then  tried  experimentally.     A  complete 
scanning  system  was  operated  in  conjunction  with  a  mask  scanner  for  a  single  character  in  order  to 
develop  comparison  circuits.     As  a  result  of  these  experiments,   a  circuit  was  developed  that 
automatically  adjusts  the  threshold  bias  used  for  recognition  to  a  value  midway  between  the  values  of 
signal  corresponding  respectively  to  the  ink  density  and  to  the  paper  surface.     Other  system 
improvements  that  have  been  developed  include  a  means  for  deriving  two  images  from  each  mask 
scanner,   one  for  recognition  and  one  for  register  control,   and  a  method  for  sensing  both  the  vertical 
and  horizontal  directions  of  misregister.     The  latter  results  then  activate  the  positioning  controls  of 
the  scanning  tubes  to  move  the  images  into  proper  register. 
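
The self-adjusting threshold bias amounts to placing the black/white decision level halfway between the ink signal and the paper signal. A minimal Python sketch follows, assuming that the ink and paper levels can be estimated from the lowest and highest samples of the scanned signal; that estimate, the sample values, and the function names are assumptions, not details given in the source.

```python
# Sketch of a clipper whose threshold sits midway between ink and paper.
# Estimating the two levels from the signal extremes is an assumption.
def midway_threshold(samples):
    ink_level = min(samples)      # darkest reading, taken as ink
    paper_level = max(samples)    # brightest reading, taken as paper
    return (ink_level + paper_level) / 2.0


def clip(samples):
    """Quantize each sample to "black" or "white" about the midway level."""
    threshold = midway_threshold(samples)
    return ["black" if s < threshold else "white" for s in samples]
```

Because the threshold follows the measured levels, the same clipper could serve for faint typing on gray paper and for dense typing on white paper without manual readjustment.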

This  experimental  work  at  DOFL  and  NBS  was  carried  to  the  point  of  constructing  a  prototype 
device  and  of  demonstrating  the  probability  that  a  successful  reader  could  be  constructed  on  these 
principles.     Several  commercial  organizations  active  in  the  field  of  automatic  character  reading  have 
continued  to  exploit  similar  techniques.     Rabinow  has  been  awarded  a  patent  disclosing  the  use  of  a 
multisided prism with many flat faces, each of which reflects the illuminated area to a different
position in the reference pattern storage array as a means of parallel simultaneous processing. 2/
Blum  has  reported  that  Schutkowski,   in  Germany,    invented  a  similar  technique  described  as  follows: 

"A  lens,    cut  with  facets,    simultaneously  projects  the  letter  to  all  matrices  of  the  alphabet. 
To  accomplish  this,   all  the  matrices  are  assembled  on  one  plate.     Behind  each  single  matrix 
there  is  a  photocell.     The  photocell  acts  when  there  is  exact  conformity  between  picture  and 
matrix.     The  letter  is  thereby  identified.     The  photocell  controls  the  Braille  or  sound  output 
of  the  reproduction  part.  "  3/ 

4.1.2 Peephole Template Matching

The second method of character recognition shown in Figure 13 (p. 35) also employs, in effect,
a template or mask, but one in which only a relatively small number of selected sub-areas of the
image field are used as apertures for matching. That is, it serves as a "definitive grid," 4/ or
weighted-area mask. This method, like that of the overall character image template matching, is
also a very early development in the field, dating back at least to the early 1930's.


1/ Cook, H. D. "A study of print reading systems leading to a proposed reader for typewritten
material," Ref. 80.

2/ Rabinow, J. "Reading machine," Ref. 366.

3/ Blum, W. "Current efforts to design reading machines," as cited in a survey of reading
machines, Ref. 373, p. 26. See also Refs. 243, 404.

4/ See Kharkevich, A. A. "The principles of the construction of reading machines," Ref. 255.
As we have previously noted, early work on readers for machine encoding for telegraphic
transmission led to two closely similar patents issued on July 28, 1931 to Parker and Weaver
respectively. Parker proposed to scan character images sequentially through a mosaic mask of
100 apertures. 1/ The Weaver patent 2/ covers a modification in which the source pattern image
field  is  scanned  through  a  plurality  of  selected  subareas  of  the  image  field,   a  "peephole"  template  or 
mask.     In  one  suggested  embodiment  of  this  patent  there  is  provision  for  the  use  of  only  five  sub-areas 
in  a  grid  superimposed  on  the  unknown  character.     These  are  so  chosen  that,   for  a  font  especially 
designed  to  cover  or  not  cover  these  critical  areas,   which  are  areas  of  optimum  discrimination,   there 
will  be  unique  combinations  of  black-white  patterns  for  32  characters  or  symbols.     Appropriate 
identification  formulas  for  derived  black-white  pulse  combinations  would  lead,   when  matched  with  the 
input  pattern  for  an  unknown  character,   to  the  triggering  of  means  to  perforate  paper  tape  with  the 
transmission  code  symbol  corresponding  to  the  character  thus  identified. 
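
The arithmetic behind the five-aperture scheme is simply that five black-or-white readings give 2^5 = 32 distinct combinations. The Python sketch below packs five readings into a 5-bit code and looks the code up in a table. The peephole positions, the 5x3 character cell, and the three table entries are hypothetical; the patent's actual font and aperture layout are not reproduced here.

```python
# Five peephole positions (row, column) in an assumed 5x3 character cell.
PEEPHOLES = [(0, 0), (0, 2), (2, 1), (4, 0), (4, 2)]

# Hypothetical code table for a font designed around these peepholes.
CODE_TABLE = {0b11111: "X", 0b10101: "S", 0b00100: "-"}

# Number of distinct black-white combinations the apertures can yield.
MAX_SYMBOLS = 2 ** len(PEEPHOLES)


def peephole_code(rows):
    """Pack the five black/white readings into a 5-bit integer,
    first peephole in the most significant bit."""
    code = 0
    for r, c in PEEPHOLES:
        code = (code << 1) | (1 if rows[r][c] == "1" else 0)
    return code


def read_character(rows):
    """Return the symbol assigned to the observed code, or None."""
    return CODE_TABLE.get(peephole_code(rows))
```

The scheme depends entirely on the font: each character must be drawn so that it definitely covers or definitely avoids every aperture, which is why a specially designed type face is assumed.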

A  similar  principle,   using  a  larger  number  of  selected  sub-areas,    so  that  the  input  pattern  is 
composed  of  a  particular  combination  of  a  number  of  black  or  white  input  pattern  elements,    is 
utilized in the Burroughs-Control Instrument Typed-Page Reader (AN/FST-6) developed for the U.S.
Army Signal Corps. 3/ Optical scanning is used and character recognition is accomplished by a
process whereby the character is, in effect, examined through an array of apertures which is
automatically registered with respect to the detected location of the character. In this case, a
specially designed font is not required. The critical 'apertures' or test points have instead been
selected by careful intercomparison of the upper case characters of standard elite typewriter font to
determine optimum match and mismatch areas that differentiate one character from another. This
method may be further illustrated by reference to Figure 14, which shows an aperture mask or
peephole template that would discriminate between the six Cyrillic characters shown.

Implementation  of  the  method  in  the  case  of  the  AN/FST-6  Reader  involves: 

(1)  Input  of  source  pattern  by  directing  the  beam  of  a  flying-spot  scanner  onto  the  page, 
pickup  of  the  scan  information  by  a  photocell,   and  use  of  a  servo  system  to  center 
the  scan  at  the  bottom  edge  of  each  character, 

(2)  Transformation  of  source  pattern  to  input  pattern  including  both  'cleanup'  operations 
and  quantization,   that  is,   conversion  of  the  video  signals  to  either  black  or  white, 

(3)  Transformation  of  the  input  pattern  to  the  reference  pattern  format  by  presenting  the 
signals  to  a  'black'  and  a  'white'  matrix,   each  having  96  cores  which  serve  as  'look' 
or  analytical  points, 

(4)  Matching  of  input  pattern  elements  to  reference  pattern  elements  by  the  generation  of 
mismatch signals when a 'black' video signal is on a 'white' core, and vice versa,
where  differential  wiring  of  the  cores  provides  a  wired-in  standard  for  each  of  the 
characters  in  the  vocabulary  and  where  the  minimum  mismatch  voltage  for  a  particular 
wired-in  reference  pattern  serves  to  identify  the  unknown  input. 
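
Step (4) above, minimum-mismatch selection over 'black' and 'white' test points, can be sketched in Python as follows. The 3x3 images, the two-character vocabulary, and the particular test points are invented for illustration and stand in for the 96-core matrices; nothing here is taken from the AN/FST-6 wiring itself.

```python
# Sketch of minimum-mismatch recognition over black and white test points.
# The test points and 3x3 images are assumptions for illustration.
REFERENCE = {
    "C": {"black": {(0, 1), (1, 0), (2, 1)}, "white": {(1, 1), (1, 2)}},
    "O": {"black": {(0, 1), (1, 0), (2, 1), (1, 2)}, "white": {(1, 1)}},
}


def mismatch_count(image, ref):
    """Count black video on 'white' points plus white video on 'black' points."""
    black_on_white = sum(1 for r, c in ref["white"] if image[r][c] == "1")
    white_on_black = sum(1 for r, c in ref["black"] if image[r][c] != "1")
    return black_on_white + white_on_black


def identify(image):
    """Return the vocabulary character with the smallest mismatch."""
    return min(REFERENCE, key=lambda ch: mismatch_count(image, REFERENCE[ch]))
```

Since the decision rests on the smallest mismatch rather than on a perfect match, a character with a small defect at one test point can still be identified, provided it remains closer to its own reference than to any other.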

In an interesting variation of the peephole template technique, Stone of the Rome Air Development
Center Laboratories 4/ developed a prototype alphanumeric character reader in which the
peephole  mask  is  a  single  master  reference  pattern  composed  of  13  areas  of  different  shape  and  size. 
When  viewed  against  this  master  pattern,   which  Stone  refers  to  as  a  "space  matrix",    each  unknown 
input  pattern  is  scored  for  the  incidence  of  black  or  white  in  each  area.     A  decoding  table  or  matrix 
then  provides  an  identification  formula  consisting  of  the  specification  of  which  areas  must  and  which 
areas  must  not  be  black  for  each  of  the  vocabulary  characters.     A  given  character  identification 
formula  need  not  specify  the  requirements  for  all  areas,   and,    on  average,   only  half  of  the  areas  are 
required  for  recognition  of  any  particular  character. 
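
An identification formula of this kind, in which some areas must be black, some must not be black, and the rest are left unspecified, maps naturally onto a pair of bit masks. The Python sketch below assumes the 13 areas are numbered 0 through 12; the two formulas shown are hypothetical and are not Stone's decoding table.

```python
# Sketch of "must / must not / don't care" identification formulas.
# Area numbering and the two formulas are assumptions for illustration.
N_AREAS = 13


def to_bits(areas):
    """Turn a collection of area numbers into an integer bit set."""
    bits = 0
    for a in areas:
        if not 0 <= a < N_AREAS:
            raise ValueError("area number out of range")
        bits |= 1 << a
    return bits


def formula(must_black, must_white):
    """Return (care, want): the areas the formula inspects, and the
    subset of those areas that must be black."""
    return to_bits(must_black) | to_bits(must_white), to_bits(must_black)


FORMULAS = {
    "A": formula(must_black=[0, 3], must_white=[5]),
    "B": formula(must_black=[0, 5], must_white=[3]),
}


def identify(black_areas):
    """Return the single character whose formula is satisfied, else None."""
    reading = to_bits(black_areas)
    hits = [ch for ch, (care, want) in FORMULAS.items()
            if (reading & care) == want]
    return hits[0] if len(hits) == 1 else None
```

Areas outside a formula's `care` mask have no influence on the result, which corresponds to the observation that a given formula need not specify requirements for all areas.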

In still another variation, the single master peephole template is replaced by several specially
designed templates. Baumann, for example, has suggested a "weighted area scanning" technique
which would use several such reference patterns, each designed for a particular subset of the
characters in the vocabulary. 5/ We shall discuss the more general case of weighted area templates
after  considering  some  of  the  other  recognition  methods. 
1/ Parker, R. D. "Telegraph reading machine," Ref. 352.

2/ Weaver, A. "Telegraph reading machine," Ref. 527.

3/ See Deckert, W.W. "The recognition of typed characters," Ref. 87, and patents issued to
Relis, et al, Refs. 381, 382, 383.

4/ Stone, W. P. "Alpha-numeric-character reader," Ref. 464.

5/ Baumann, D.M., F.T. Brown, et al. "Character recognition and photomemory storage
devices feasibility study," Ref. 43; also Ref. 44.
Figure 14. Peephole Template For Cyrillic Characters
(Legible character labels: "Shchah", "Yeh", "Virt", "Yio", "Short Ec".)
4.1.3 Coordinate Description Matching

Just as was the case for the two template methods discussed above, we find that the method of
coordinate description matching (Figure 13c) was also a very early suggestion in the field. Parker,
whose patent we have noted was issued on the same date in 1931 as was that of Weaver, 1/ described
his own invention as follows:

"It  is  assumed  that  one  letter  at  a  time  is  passed  before  a  scanning  or  analyzing  device.  .  . 
Each  letter  is  illuminated.  .  .  by  a  source  of  light  and  a  perforated  disk  whereby  the  area 
covered  by  the  letter  is  scanned  in  parallel  lines;  that  is,   parallel  rows  of  small  sub-areas 
are  examined  in  sequence.     Since  each  letter  of  the  alphabet  and  each  of  the  other  commonly 
employed  characters  is  of  a  distinctive  shape,   a  different  arrangement  of  pulses  will  be 
produced  in  the  output  circuit  of  a  photo-electric  cell  associated  with  the  illuminating  and 
scanning  device.     The  output  of  the  photo-electric  cell  is  then  amplified  by  suitable  means 
and is passed to the arm of a distributor having a series of segments -- one hundred, for
instance.     Each  letter  or  other  character  may  be  represented  by  a  certain  combination  of  these 
distributor  segments  and  when  that  combination  is  recorded  the  proper  selecting  magnet  or 
magnets  of  the  perforator  or  printer  will  be  operated.  " 

Thus we would have, in effect, the superimposition of a rectilinear grid, mosaic, or small-mesh
aperture array upon the image field in which the source pattern character appears. The
identification formulas for given characters would then consist in the description, by their
coordinates, of those cells or sub-areas that are black or that are white when that particular character
is sensed through the grid. This coordinate description method differs from the two previously
discussed  template  methods  in  that  it  considers  the  entire  image  field  and  tests  each  area  of  that 
field  for  source  patterns  that  may  be  located  in  various  positions  in  the  field. 

Scanning  means  for  the  sensing  of  a  vertical  segment  of  the  image  field  are  typically  combined 
with  means  for  keeping  track  of  relative  position  within  each  such  segment.     This  combination, 
together with timing control accounting as scans are repeated for successive vertical segments,
provides  an  effective  equivalent  to  the  actual  superimposition  of  a  rectilinear  grid  on  the  image  field 
as  a  whole  and  the  subsequent  inspection  of  each  cell  of  that  grid.     Successive  segments  may  be 
obtained  by  moving  the  scanning  means  from  one  edge  of  the  image  field  to  the  other.     Alternatively, 
and  much  more  commonly,    segment  readings  may  be  obtained  by  moving  the  image  field  in  discrete 
steps past the scanning station, e.g., by movement of the carrier on which the source pattern
appears. Subdivisions along a segment or slice of the image field may be achieved by time-interval
count control or by a quasi-geographic factor such as which specific one or ones of a column of
photocells are affected by the portions of character encountered in each scan segment. 2/ If the column
of  photocells  has  more  photocell  members  than  a  normalized  range  of  heights  of  characters  in  the 
vocabulary  would  require,   then  suitable  adjustments  for  vertical  misalignments  of  source  patterns 
can  be  accommodated  in  the  system. 

The  use  of  similar  techniques  in  Russian  investigations  of  character  recognition  has  been 
described,   for  example,   by  Kharkevich,   as  follows: 

"The  scanning  process  consists  of  sequential  inspection  of  the  image  by  a  traveling 
beam.     The  trajectory  of  motion  for  the  beam  shall  be  called  the  scanning  line.     The 
scanning  line  is  specified  analytically  in  a  certain  coordinate  system  which  is    relative 
to  the  field  of  the  image.     The  simplest  forms  of  scanning  are  directly  related  to  definite 
coordinate  systems,    so  that  the  scanning  line  reproduces  a  coordinate  grid.  "    3/ 

Significantly, the coordinate description recognition method involves in most cases an encoded
template as a reference pattern. Often, in fact, the encoding results in a string or sequence of
binary symbols, where each bit position represents a particular cell of the grid. Thus, for example,
in an 8x9 array, "some advantages in richness of detailed information and, possibly, hardware may
attend the procedure of expressing the morphological structure of characters by means of 72-place
numbers of the binary system, with each 0 or 1 of the expression corresponding respectively to a
non-intersection or an intersection of character and sensing line in the coordinates of the sensing
system." 4/
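
The encoding of a grid into a 72-place binary number, and the cell-by-cell comparison of two such numbers, can be sketched in Python as follows. The sketch assumes the 8x9 array has 8 columns and 9 rows read row by row, and uses the count of disagreeing cells as the measure of match; both choices, and the function names, are assumptions for illustration.

```python
# Sketch of coordinate-description encoding for an assumed 9-row, 8-column grid.
ROWS, COLS = 9, 8


def encode(grid):
    """Pack the grid, row by row, into one integer of ROWS * COLS bits;
    a 1 bit marks a cell in which the character is present."""
    if len(grid) != ROWS or any(len(row) != COLS for row in grid):
        raise ValueError("grid must be 9 rows of 8 cells")
    value = 0
    for row in grid:
        for cell in row:
            value = (value << 1) | (1 if cell == "1" else 0)
    return value


def disagreements(code_a, code_b):
    """Number of grid cells in which two encoded patterns differ."""
    return bin(code_a ^ code_b).count("1")


def closest(input_grid, references):
    """Name of the encoded reference pattern differing in the fewest cells."""
    code = encode(input_grid)
    return min(references, key=lambda name: disagreements(code, references[name]))
```

In this form the reference "template" is no longer an optical mask but a stored number, and the comparison of cell (i, j) of the input with cell (i, j) of the reference becomes a bitwise operation on two such numbers.
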
1/ Parker, R. D. "Telegraph reading machine," Ref. 352.

2/ For example, a Rabinow Engineering reader uses this technique; see Ref. 369.

3/ Kharkevich, A. A. "On the principles of designing reading machines," Ref. 255, p. 19.

4/ Boni, C. "Russian type study," Ref. 55, p. 6-8.


Whether the encoding is direct or indirect, however, the encoded input pattern is typically
matched to the correspondingly encoded reference patterns. That is, the matching decisions with
respect to the input-pattern elements are: Is this cell, with coordinates i, j, black where the
reference pattern cell with the same coordinates is black, or white where it is white? Even in the
well-publicized Perceptron 1/ systems, despite the emphasis upon 'learning' and upon the randomness
of the connections between S-units (i.e., sensory-receptive cells of a retinal mosaic) and A-units
(association cells, between which and the sensory units random connections have been established),
the identification formula for recognition of a given character or geometric shape will depend on some
particular coordinate-description in the A-unit space which has been 'reinforced' for 'correct
response', however dissimilar in shape this may be to the original source pattern. In the Perceptron
systems, of course, as in many reader systems, the actual identification formula for any recognizable
symbol may be that of a highly idealized or skeletonized character.
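
The cell-by-cell matching question posed above (black where the reference is black, white where it
is white) can be sketched as a count of agreeing cells between two encoded patterns. This is a
minimal illustration, assuming patterns are held as integers with one bit per cell; the names
agreement and best_match and the optional acceptance threshold are assumptions introduced here.

```python
def agreement(input_code, reference_code, n_cells=72):
    """Count the cells in which input and reference have the same colour."""
    differing = bin(input_code ^ reference_code).count("1")
    return n_cells - differing


def best_match(input_code, references, n_cells=72, min_agreement=None):
    """Return (label, score) for the closest reference pattern.

    If min_agreement is given and the best score falls below it,
    the label is None (the input is rejected as unrecognized).
    """
    label, score = max(
        ((name, agreement(input_code, code, n_cells))
         for name, code in references.items()),
        key=lambda item: item[1],
    )
    if min_agreement is not None and score < min_agreement:
        return None, score
    return label, score
```

For a toy 4-cell field, best_match(0b1110, {"A": 0b1111, "B": 0b0011}, n_cells=4) selects "A",
which agrees with the input in three of the four cells.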

We  should  note  here,  however,   that  in  considering  these  first  three  common  methods  of 
character  recognition,   all  of  which  include  'templates'  in  one  form  or  another,   we  have  made  certain 
simplifying  assumptions.     In  particular,   we  have  tended  to  assume  that  the  source  pattern  (that  is  the 
unknown  character  to  be  recognized)  has  been  accurately  positioned  so  that  the  image  field  for  the 
input  pattern  coincides  with  the  area  defined  by  the  boundaries  of  any  character  to  be  recognized  in 
the  system. 

Such  an  assumption  is  not,   in  general,    supported  by  realistic  operational  requirements.     Even 
if  the  recognition  vocabulary  is  limited  to,    say,    16  numeric  and  special  symbol  characters  of  an 
adding  machine  or  of  a  cash  register  tally  machine,   or   to  the  characters  of  a  standard  typewriter 
font,   there  is  still  the  problem  that  lines  of  typed  or  tally-roll  characters  are  subject  to  mis- 
registration,  misalignment,   and  skew  of  the  individual  characters.     Thus,   as  Horwitz  and  Shelton 
observe:

"One of the outstanding engineering problems encountered in the construction of
character recognition devices has been the accommodation of character misregistration." 2/

Frequently  encountered  anomalies  of  registration  must  be  accounted  for  if  automatic  character 
readers  are  to  become  truly  productive  tools  for  data  processing.     Even  for  conventionally  printed 
characters,   there  are  significant  character-to-character  variations  in  width  and  in  height,   notably 
including  lower-case  use  of  ascenders  and  descenders.     There  are  also,   for  page-reading  and  other 
multi-line  reading  operations,    significant  differences  in  size,   even  within  the  same  type-style,   that 
must  often  be  accommodated.     Moreover,   within  the  same  font,   relative  height-width  proportions  for 
character  strokes  may  vary  with  varying  size.     (See  Figure   15  for  some  of  the  features  and  terms 
that  are  commonly  used  in  identifying  differential  type  style  characteristics.)    There  are,   in 
principle,   two  classes  of  solution  to  these  and  related  problems. 

The  first  such  solution  is  to  provide  for  an  image  field  large  enough  to  accommodate  not  only 
the  tallest  and  widest  characters  that  will  occur  in  the  permitted  vocabulary,   but  also  some  pre- 
established  allowable  range  for  vertical  and  horizontal  displacements.     Unfortunately,   for  image 
fields (i.e., input pattern possibilities) larger than actual source patterns, the number of possible
coordinate descriptions soon becomes astronomical if we allow for variations in size, rectilinear
translation, and rotation. Fain 3/ has calculated that the total number of coordinate descriptions
possible for any geometric shape in a given image field that is divided by a superimposed grid
lies between $\frac{1}{6}\pi k^{2}$ and $\pi k^{2}$, where $k$ is the area of the retinal grid in terms of the
number of cells in the grid. He has used this estimate as a major criticism against character (and
geometric shape) recognition schemes such as those of the Perceptron, Taylor, 4/ and others, which
depend upon direct or derived coordinate descriptions as recognition-identification formulas.


1/ See Rosenblatt, F. "The Perceptron: a perceiving and recognizing automaton (Project
Para)," Ref. 393. See also Refs. 52, 66, 83, 241, 242, 321, 392, 394, 396, 397, 398.

2/ Horwitz, L. P., and Shelton, G. L., Jr. "Pattern recognition using autocorrelation," Ref. 220,
p. 175.

3/ Fain, V. S. "On the quantity of coordinate descriptions of the image in systems designed for
recognition of visible objects," Ref. 128.

4/ Fain, V. S. "On the principles of designing a machine for recognizing images," Ref. 127. For
Taylor's work, see Refs. 132, 338, 478, 479, 480, 481, 482. We should note, of course, that
both Rosenblatt's experiments with Perceptrons and Taylor's work in pattern recognition are
directed primarily to the investigation of self-organizing systems and to possibilities for
machine simulation of learning-type processes, rather than to the development of practical
character reading equipment.

Figure 15. Terms Used in Identifying Type Style Characteristics

Similarly, Young, in commenting on Taylor's recognition system, concludes that:

"An inordinate amount of logic would be required to analyze the summations of all
possibly semantically equivalent combinations of sensory data, derived from a sensory
matrix sufficiently large to permit of the necessary discriminations." 1/

Metzelaar has put the same general problem in the form of an analogy in which he compares the degree
of mesh necessary to pick out the smallest level of detail necessary for discrimination (e.g., the
diacritics shown in Figure 14) to grains of sand, and concludes that:

"...today's average recognition machine would attempt to recognize a mountain by
examining each and every grain of sand in relation to its neighbors to build up the pattern." 2/

A second class of solutions to the problem of registration of the image within the coordinate
description field is directed primarily to the problem of rectilinear displacement or translation rather
than to those of rotation or expansion. In this class of solution we find the successive
horizontally-displaced 'tries' for a match in the DOFL-NBS optical template readers. We also note as
belonging to this class of solution the 'finding' of a boundary edge (bottom, top, etc.) to serve as a
reference mark against which to superimpose a peephole-template. This operation tends to limit the image
field to the outermost expected limits of the widest and tallest characters, as in the
Burroughs-Control Instrument Page Reader. 3/ Other techniques are used both in optical and in magnetic ink
character scanning in order to physically center the source pattern as it is transformed to an input
pattern, either in the actual center of a field corresponding to the reference pattern image field,
or by, in effect, moving the input pattern until it touches one or more designated edges of such a
field.

Several systems use various types of processing of input patterns through a suitably designed
shift register, either to allow for various possible coordinate descriptions of the same input pattern
under rectilinear translation or to provide a position-normalizing feature such as centering before
the matching operations are attempted. The principles involved are exemplified in a prototype reader
developed at Standard Elektrik Lorenz, and demonstrated at Hamburg, Germany, in 1959. 4/ In this
machine, a shift register technique is used for systematically moving the input pattern through
various possible coordinate description-positions. 5/ This is done until the input pattern is
sufficiently 'centered' for match with the reference characters, say, until a cell or cells in the
upper lefthand corner (column 1 and row 1 of the grid) show black. If no recognition occurs, the
process is repeated, shifting by pulses (as derived from the input black-white quantized scan) to the
left and to the top, for another try. Since this machine is designed for the reading of typewritten
numerals, the shifting technique is intended not only to control the displacement problem but also to
alleviate recognition problems caused by badly smudged characters.
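
The position-normalizing idea can be sketched in software as a shift of the quantized pattern
toward the upper lefthand corner of the grid. This is a hedged illustration only, not a description
of the Standard Elektrik Lorenz circuitry: it assumes that "centered" means that row 1 and column 1
each contain at least one black cell, and the name shift_to_corner is hypothetical.

```python
def shift_to_corner(grid):
    """Shift a 0/1 grid up and left until its first row and first column
    each contain at least one black cell. A blank grid is returned unchanged."""
    rows, cols = len(grid), len(grid[0])
    black_rows = [i for i, row in enumerate(grid) if any(row)]
    black_cols = [j for j in range(cols) if any(row[j] for row in grid)]
    if not black_rows:
        return [list(row) for row in grid]
    top, left = black_rows[0], black_cols[0]
    shifted = [[0] * cols for _ in range(rows)]
    for i in range(top, rows):
        for j in range(left, cols):
            shifted[i - top][j - left] = grid[i][j]
    return shifted


displaced = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
normalized = shift_to_corner(displaced)
```

After normalization, every translated copy of the same pattern yields the same coordinate
description, so a single reference pattern per character suffices for displacement alone. Size
variation and skew are not addressed by this step.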

Related possibilities obviously include shifting a given input pattern through all positions until a
match with a reference pattern occurs. These focussing-down or shifting-through solutions, however,
are not necessarily adaptable to problems of size variation or of tilt or skew. The Rabinow
Engineering Company has proposed a method that might rather ingeniously accommodate a limited
number of variable coordinate descriptions as well as certain different styles of some specific
character (i.e., a Roman "A", an italic "A", a Greek "A", etc.). This proposal is based on an
adaptation of the "Peek-a-Boo" technique that has become familiar in some nonconventional
information retrieval systems. 6/ More specifically, the proposed reader involves, first, a fixed
matrix of photocells onto which the source patterns are projected. The associated reference pattern

1/ Young, D. A. "Automatic character recognition," Ref. 539, p. 4.

2/ Metzelaar, P. "Mechanical realization of pattern recognition," Ref. 305, p. 13 (preprint).

3/ See Refs. 57, 79, 87, 381, 382, 383.

4/ See Dietrich, W. "The automatic recognition of typewritten numbers," Ref. 96; also Refs. 97,
443, 444.

5/ Similarly, in the IBM 1210 Sorter/Reader for E13B magnetic ink characters, the input pattern,
which is stored in a 10x7 matrix, is 'rolled' up through each of the 10 rows. See Ref. 126.

6/ See Refs. 535, 536. Note also that Bledsoe and Browning, Ref. 51, apply a modified
"Peek-a-Boo" or term-entry type system, by dedicating bit positions of stored computer words
to the presence or absence of a bit, for a given n-tuple matching test for a previously seen
input pattern, to the category to which the previously seen pattern belongs.

storage consists of a stack of movable cards or plates which represent the modified "Peek-a-Boo"
system.

Each  photocell  of  the  reading  matrix  activates  the  motion  of  one  card  into  one  of  two  positions, 
slightly  displaced,    depending  upon  whether  the  light  from  the  photocell  exceeds  a  specified  threshold 
value.     In  the  stack  of  cards  there  is  another  matrix  having  one  or  more  fields  or  hole  areas  for  each 
character  to  be  recognized.     Light  is  impressed  at  one  end  of  the  stack,   as  the  input  pattern  activating 
the  reading  process  has  been  processed,   and  there  is  a  second  matrix  of  photocells  at  the  other  end  of 
the  stack.     These  latter  photocells  comprise  a  recognition  matrix,    with  either  one  photocell  for  each 
character to be identified, or better, $n$ cells can be used to identify $2^{n}$ characters.

If there are $n$ reading (sensing) photocells, there are $n$ cards, and $2^{n}$ card position
combinations. If there are $\ell$ characters to be recognized, then there are $n$ recognition photocells
where $2^{n} = \ell$ (here $n$ counts the recognition photocells, not the sensing photocells). For any
printed character, there may be (within economic limits) arbitrarily many
combinations  of  card  positions  which  give  recognition,    corresponding  to  different  fonts,   upper  or 
lower  case,    skewed  characters,   or  character  images  displaced  in  either  horizontal  or  vertical 
direction.     As  a  character  source  pattern  is  moved  across  the  scanning  field  (or,    conversely,   as  the 
scan-read  mechanism  is  moved  across  the  character  image  field),    recognition  could  be  arranged  to 
occur  at  many  points,   and  the  actual  identification  could  be  effected  on  a  probability  basis  in  order  to 
allow  a  limited  number  of  false  recognitions  to  occur  without  necessarily  spoiling  the  overall 
accuracy  of  such  a  recognition  system. 
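
The logic of the proposal (many accepted card-position combinations per character, a binary-coded
recognition output, and identification on a probability basis as the character crosses the field)
can be sketched in software. This is an illustration under stated assumptions, not the Rabinow
mechanism itself: the lookup table stands in for the card stack, the majority vote is one possible
reading of "on a probability basis", and the names output_cells, build_table, and recognize_sweep
are hypothetical.

```python
from collections import Counter


def output_cells(n_chars):
    """Smallest number n of binary recognition cells with 2**n >= n_chars."""
    return max(1, (n_chars - 1).bit_length())


def build_table(samples):
    """Map every accepted sensing pattern (a tuple of 0/1) to its character.

    samples: dict of character -> list of accepted patterns, e.g. the same
    character in different fonts, skewed, or displaced.
    """
    table = {}
    for char, patterns in samples.items():
        for pattern in patterns:
            table[tuple(pattern)] = char
    return table


def recognize_sweep(table, frames):
    """Vote over the frames seen as the character moves across the field.

    Frames that match no accepted pattern are ignored; a few false
    recognitions are outvoted by the correct ones. Returns None if
    no frame is recognized at all.
    """
    votes = Counter(table[tuple(f)] for f in frames if tuple(f) in table)
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

With a 14-character vocabulary, output_cells(14) gives 4, in line with the observation that n
cells can identify up to $2^{n}$ characters.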

Fitch  has  proposed  still  a  different  attack  on  the  problem  of  the  many  different  coordinate 
description  templates  necessary  to  accommodate  variable  configuration  and  registration  of  the  same 
character. 1/ He considers first a binary encoding of those coordinate positions which are black
for  a  given  source  pattern  presentation.     The  coded  input  pattern  is  then  fed  to  a  cross-grid  network 
analyzing  circuit  which  effects  a  decision-tree  type  of  recognition  decision.     The  paths  through  the 
network  constituting  the  reference  pattern  for  each  vocabulary  character  are  set  up  by  the  use  of  wire 
contact  relays  having  pluggable  jumper  wires.     In  each  reference  pattern  network,    100  or  more  paths 
representing  permissible  variations  in  coordinate  descriptions  of  the  same  character  may  be 
established.     The  pluggable  jumper  wire  technique  permits  the  establishment  or  subsequent  modifica- 
tion of  particular  paths  based  upon  empirical  observations  of  sample  characters. 

Other interesting variations on the coordinate description method include the disregarding of
selected 'grey' cells or sub-areas (those which, for a particular vocabulary, do not serve to
distinguish one character from another), 2/ or the differential weighting of results from different
sub-areas, 3/ to form what Kharkevich terms a "definitive grid." 4/ Thus, for example, in the
Taylor proposals, it is claimed that:

"To make use of the fact that the detail important for recognition is located in regions
where the signal amplitude changes appreciably, it is necessary to attach greater weight to
signals derived from such regions. This weighting process will be called 'detail filtering',
and the detail-filter networks for performing this operation are believed to be similar to the
neural networks that produce a similar effect in the nervous system." 5/
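
Both variations just described, ignoring 'grey' cells and weighting sub-areas differently, reduce
to a weighted count of agreeing cells. The sketch below is illustrative only; the function name
weighted_score and the convention that a weight of zero marks a disregarded cell are assumptions,
and it is not a model of Taylor's analogue detail-filter networks.

```python
def weighted_score(input_cells, reference_cells, weights):
    """Sum the weights of the cells in which input and reference agree.

    A weight of 0 marks a 'grey' cell that is disregarded; larger weights
    mark cells that discriminate strongly between vocabulary characters.
    """
    if not (len(input_cells) == len(reference_cells) == len(weights)):
        raise ValueError("all three sequences must have the same length")
    return sum(
        w for x, r, w in zip(input_cells, reference_cells, weights) if x == r
    )
```

For example, weighted_score([1, 0, 1, 1], [1, 1, 1, 0], [2, 0, 1, 3]) counts only the first and
third cells as agreeing and returns 3; the disagreement in the second cell costs nothing because
that cell carries zero weight.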

These  methods  are  similar  to  the  critical  or  significant  area  techniques  in  certain  of  the 
peephole -template  methods.     Still  another  method,   that  of  Bledsoe  and  Browning,   utilizes  a  number 
of  arbitrarily  chosen  n-tuples  of  grid  cells  (exclusive  pairs,   triples,   etc.  ,    randomly  located  in  the 
field)  to  derive  black-white  'scores'  for  an  input  pattern,   which  are  then  compared  against  the 
previously derived scores for reference patterns. 6/
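
A minimal sketch of an n-tuple scheme of this general kind follows. It is written under several
assumptions and should not be read as Bledsoe and Browning's program: cells are partitioned into
exclusive, randomly located n-tuples; the 'score' of an input for a category is the number of
n-tuples whose black-white state was previously seen for that category; and all function names are
hypothetical.

```python
import random


def make_tuples(n_cells, n, seed=0):
    """Partition randomly ordered cell indices into exclusive n-tuples."""
    rng = random.Random(seed)
    cells = list(range(n_cells))
    rng.shuffle(cells)
    usable = n_cells - n_cells % n  # drop leftover cells that do not fill a tuple
    return [tuple(cells[i:i + n]) for i in range(0, usable, n)]


def states(pattern, tuples):
    """Black-white state of each n-tuple for a flat 0/1 pattern."""
    return [tuple(pattern[c] for c in t) for t in tuples]


def train(samples, tuples):
    """Record, per category, which states each n-tuple has shown.

    samples: list of (label, pattern) pairs.
    """
    memory = {}
    for label, pattern in samples:
        seen = memory.setdefault(label, [set() for _ in tuples])
        for slot, state in zip(seen, states(pattern, tuples)):
            slot.add(state)
    return memory


def classify(pattern, memory, tuples):
    """Return (best_label, scores); a score counts n-tuples in a known state."""
    observed = states(pattern, tuples)
    scores = {
        label: sum(state in slot for slot, state in zip(seen, observed))
        for label, seen in memory.items()
    }
    return max(scores, key=scores.get), scores
```

Training on several differently registered samples of the same character simply adds states to
that character's memory, which corresponds to storing multiple reference patterns per category.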

1/ Fitch, C. J. "Character sensing and analyzing system," Ref. 135.

2/ See Refs. 225, 226, 521.

3/ See Refs. 478, 479, 482.

4/ Kharkevich, A. A. "The principles of the construction of reading machines," Ref. 255.

5/ Taylor, W. K. "Pattern recognition by means of automatic analogue apparatus," Ref. 482,
p. 204. For similarity to phenomena in living organisms, see Lettvin, J. L., et al. "What
the frog's eye tells the frog's brain," Ref. 269.

6/ See Bledsoe, W. W. and I. Browning. "Pattern recognition and reading by machine,"
Ref. 51.

Multiple reference patterns are necessary (and allowed) in the Bledsoe and Browning system for
the simple reason that, with rotation and translation and size variance (expansion and contraction), the
type of n-tuples suggested would saturate and discriminability would be lost. 1/ Bomba, 2/ however,
suggests that n-tuples based upon relatively invariant feature assumptions can be used to preserve or
increase discrimination, and with obviously greater economy in storage of reference patterns or
reference pattern elements.

4.  1.  4    Characteristic  Waveform  Matching 

The  characteristic  waveform  matching  method  (Figure  13d)  is  perhaps  best  exemplified  by  a 
Stanford  Research  Institute  reader  for  magnetic  ink  recognition  for  use  in  ERMA,   a  system  which  was 
a  prototype  for  the  American  Bankers  Association  promotion  of  industry-wide  adoption  of  MICR.     In 
this  recognition  system,    3/  the  characters  are  printed  with  an  ink  pigment  containing  iron  oxide,   and 
they  are  styled  so  that  the  various  characters  in  the  vocabulary  produce  dissimilar  waveforms  when 
scanned  by  the  machine.     Immediately  before  reading,   the  characters  are  magnetized  uniformly  by  a 
fixed-field  magnetic  write  head.     The  paper  carrier  (check)  on  which  the  characters  appear  is  then 
moved  past  a  magnetic  read  head.     This  read  head  is  similar  to  conventional  magnetic  tape  reading 
heads,   but  the  gap  length  is  several  times  the  height  of  the  character,   allowing  the  reader  to  tolerate 
a  degree  of  displacement  with  respect  to  registration  and  a  minor  amount  of  skew.     As  the  source 
pattern  moves  past  the  reading  head,   the  head  senses  the  area  of  magnetic  ink  under  it  at  a  given 
time  interval,   and  produces  an  input  pattern  which  is  an  instantaneous  output  voltage  proportional  to 
the  rate  of  change  of  the  magnetic  flux.     A  characteristic  voltage  waveform  is  thus  developed  as  a 
function  of  time.     The  waveform  is  next  fed  into  a  lumped-constant  delay  line  which  is  of  sufficient 
length  to  permit  storage  of  the  entire  character  waveform  derivative. 

Comparison  of  the  waveform  to  other  possible  waveform  patterns  is  made  from  a  number  of  tap 
points  on  the  delay  line  holding  the  total  input  pattern.     These  go  to  each  of  fourteen  different 
correlation  networks,   one  for  each  of  the  fourteen  numerals  and  symbols  in  the  vocabulary  of  this 
system.     The  delay  line  and  the  correlation  networks  (which  consist  of  a  number  of  resistance 
matrices)  form,    in  effect,   a  family  of  matched  filters  which  uses  the  full  analog  information  in  the 
waveform  for  decoding,    so  that  minor  deviations  from  the  expected  shape  are  permitted  when  making 
the  identification  decision.     A  timing  circuit  supplies  a  readout  gate  when  the  waveform  is 
completely  stored  in  the  delay  line.     Fourteen  differential  amplifiers  compare  the  outputs  of  the 
various  correlation  networks  at  the  instant  of  readout,    so  that  recognition  is  based  on  the  correct 
channel having a greater output than any of the other channels. The decision is thus made on a 'best
fit' basis. Upon identification, the output signals are coded into a form that can be used by a
computer, sorter, or other equipment for output of the appropriate target pattern.
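
The 'best fit' decision among the correlation channels can be sketched digitally. The sketch
assumes that the tapped delay line is represented by a list of waveform samples and each
correlation network by a list of reference weights; the normalization of each reference to unit
energy and the optional decision margin are assumptions added for the illustration, and the names
channel_outputs and identify are hypothetical.

```python
import math


def unit(vec):
    """Scale a reference waveform to unit energy."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0:
        raise ValueError("reference waveform must not be all zero")
    return [v / norm for v in vec]


def channel_outputs(samples, references):
    """Output of each correlation channel at the instant of readout."""
    outputs = {}
    for label, ref in references.items():
        if len(ref) != len(samples):
            raise ValueError("reference and input must have the same length")
        outputs[label] = sum(s * w for s, w in zip(samples, unit(ref)))
    return outputs


def identify(samples, references, margin=0.0):
    """Return the label of the channel with the greatest output.

    Returns None when the best channel does not exceed the runner-up
    by more than margin (no clear 'best fit').
    """
    outputs = channel_outputs(samples, references)
    ranked = sorted(outputs, key=outputs.get, reverse=True)
    best = ranked[0]
    if len(ranked) > 1 and outputs[best] - outputs[ranked[1]] <= margin:
        return None
    return best
```

Because the full sample values enter the sum, a waveform that deviates slightly from the expected
shape still produces its largest output in the correct channel, which is the tolerance the analog
matched filters are described as providing.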

Closely  similar  methods  may  be  employed  in  optical  scanning,   also,    with  suitable  means  for 
determining  the  relative  black,    rather  than  the  relative  magnetization,    in  successive  vertical  sub- 
areas  which  segment  or  'slice'  the  character.     In  both  magnetic  and  optical  uses  of  the  characteristic 
waveform  method,   however,   the  number  of  discrete  characters  that  can  be  discriminated  is  usually 
quite  limited,   and  closely  similar  characters  found  in  many  conventional  fonts  often  cannot  be 
distinguished  one  from  another. 

We note that the transformation of the input pattern (derived time-varying signal or continuous
waveform) to the reference pattern format (Operation 3 of Figure 12, p. 34), usually consists, in the
characteristic waveform matching method, in the sampling of the input pattern at discrete intervals.
This is not at all necessarily a series of 'critical' choices, however, as in most instances of
peephole-template matching. Moreover, the number of sampling points that are technically or
economically  easy  to  use  in  such  a  system  may  markedly  affect  the  total  size  of  the  vocabulary  that 
can  be  handled.     This  is  particularly  the  case  where  a  waveform  derived  from  single-slit  scanning  is 
used as the input pattern. Thus Dickinson states:

"The single-slit scan system can be used for a limited number of characters provided
a special font of characters is used. The extension of this system to reliably read
alpha-numeric characters of normal size that are humanly readable does not appear feasible." 4/

1/ See discussion by Kirsch, Ref. 49.

2/ Bomba, J. S. "Alpha-numeric character recognition using local operations," Ref. 54.

3/ Eldredge, K. N., et al. "Automatic input for business data processing systems," Ref. 109.
See also Refs. 110, 111, 112, 124, 247.

4/ Dickinson, W. E. "A character-recognition study," Ref. 95.

Further developments in the use of characteristic waveform methods therefore have included
exploration of possibilities for combining the results of slit scanning in several different directions.
In particular, work at the University of Birmingham, England, under a U.S. Army contract, has
included experimentation with a 3-way system providing a horizontal scan and two 45° scans, together
with means for correlation detection, in order to differentiate the characters of a standard
commercial font. 1/
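
Why a second and third scan direction help can be seen from a small sketch that forms the slit
profile of a quantized pattern in three directions. It is an illustration under assumptions, not a
description of the Birmingham equipment: each slit reading is taken as the count of black cells
under the slit, and the name slit_profiles is hypothetical.

```python
def slit_profiles(grid):
    """Black-cell counts seen by a slit scanned in three directions.

    Returns (horizontal, diag_down, diag_up):
      horizontal: one count per column (vertical slit moved horizontally)
      diag_down:  one count per diagonal of constant (column - row)
      diag_up:    one count per diagonal of constant (row + column)
    """
    rows, cols = len(grid), len(grid[0])
    horizontal = [0] * cols
    diag_down = [0] * (rows + cols - 1)
    diag_up = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            if grid[i][j]:
                horizontal[j] += 1
                diag_down[j - i + rows - 1] += 1
                diag_up[i + j] += 1
    return horizontal, diag_down, diag_up


slash = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]      # a rising stroke
backslash = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # a falling stroke
```

The two strokes give identical horizontal profiles, so a single-slit scan cannot separate them,
whereas their 45° profiles differ; combining directions therefore enlarges the set of characters
that can be told apart.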

4.  1.  5    Vector  Crossings 

In  each  of  the  four  previously  considered,    commonly  used  methods  of  automatic  character 
recognition  (template  matching,   peephole-template  matching,    coordinate  description  matching,   and 
characteristic  waveform  matching),   we  have  been  dealing  with  situations  in  which  it  is  tacitly  assumed 
that  all  the  possible  source  patterns  which  can  be  'recognized'  in  a  particular  system,   have,   at  least 
as  prototypes,   been  'seen'  before.     Recognizable  patterns  can  be  found,   by  suitable  area-preserving 
and  shape-preserving  transformations,   to  be  equivalent  to  previously  seen,   classified,   and  identifiable, 
reference  patterns. 

This  assumption,   of  course,    implies  that  all  recognizable  variations-of-a-single-character  for 
a  given  system  have   previously  been  analyzed  as  to  significant,    recurrent  characteristics  or 
properties  in  the  process  of  establishing  the  reference  patterns  and  the  identification  formulas.     In 
some  systems,    provisions  have  been  made  for  variable  identification  thresholds,    such  that,   for 
example,   an  abnormally  thick  or  an  abnormally  thin  "I"  would  still  be  recognized  as  "I".     In  an 
operational  system  such  as  the  Solartron  ERA  machine  for  tally-roll  accounting  data  input  for  chain- 
store  accounts  data  processing  applications,    some  latitude  in  possible  vertical  displacement  is 
provided. 2/ But such gains are typically achieved at the price of duplicating the recognition logic
circuitry, of multiplying the reference pattern storage requirements, 3/ or both.

Methods  which  try  to  develop  areas  of  optimum  discrimination  for  recognition-identification 
(especially  in  the  case  of  peephole-  templates,   or  in  weighted  coordinate  description  methods  using 
'gray'  or  'don't-care'  conditions)  also  assume  that  the  range  of  allowable   variations  in  vocabulary- 
character  configuration  is  known  in  advance.     That  is,   alternate  placements  and  minor  details  of 
shape  of  character-variants  to  be  recognized  as  a  single  character  in  the  output  of  the  recognition 
system have been specified. The peephole-template technique, for example, if properly designed,
will tend to ignore serifs and other minor typographic-stylistic embellishments as well as certain
additive or reductive noise (i.e., smudging or mutilation of the original character image).
Nevertheless, the conditional requirement in this as in other cases, 'properly designed', implies
exhaustive advance exploration of the permissible possibilities of source pattern variation.

These  first  four  methods  for  automatic  character  recognition  are,    then,   primarily  closed-end 
vocabulary  systems.     They  are,   at  least  in  principle,   template -matching  systems  however  different 
the  reference  patterns  as  templates  may  be  from  the  appearance,   to  the  human  eye,   of  the  source 
patterns that are machine-recognizable in the system. Under template matching conditions, of
whatever kind, there is implicit a one-to-one correspondence between 'input-pattern-space' and
'reference-pattern-space' that is generally conservative of both area and shape characteristics of the
original given image.

Moreover, the subsequent mapping from the specific location in "reference-pattern-space",
as indicated by input pattern processing, to "target-pattern-space" is usually such as to provide that
a given "A" on the printed, typed, or handwritten page will be consistently correlated with an
appropriately equivalent code or symbol output form. This equivalent target pattern will represent
the character 'A' if and only if the source pattern significantly coincides with a reference-character
version of the physically-similar configuration of "A" as it might be seen by the human eye. In other
words, a set of "A" patterns must have been repetitively observed and classified as 'A' in a sense
that preserves certain of the area and shape characteristics common to the set, before an
"A"-template, leading to an 'A' identification, can be utilized. This will generally follow,
notwithstanding the fact that other templates, for sets ("A")' and ("A")'', etc., may also be
correlated with a suitable target pattern output representing 'A'.

In many cases, however, template-reference-pattern vocabularies are not a feasible solution,
for the simple reason that the possible variations of source
pattern corresponding to some specific desired target pattern output are not well enough known in
advance. Such is obviously the case with handprinted or handwritten characters. In these situations,

1/ Birmingham University, "Character recognition," Ref. 47.

2/ See Refs. 25, 30, 84, 116, 117, 118, 405, 435, 436, 437.

3/ To at least the possibilities for constructing the $\pi k^{2}$ requirements for coordinate description
variations for a given reference pattern. See Fain, V. S., Ref. 128.

effective, precise templates are either not known, or, even if they were available, would impose
prohibitive reference pattern storage requirements. Thus there has been, with respect to possibilities
for automatic character recognition in such situations, either the imposition of constraints on possible
variations of size or shape or centering, or an attempt to utilize relatively invariant discriminating
characteristics, or both.

The method that we have termed "vector crossing matching" (Figure 12e, p. 34) has been used
in both types of proposed solution. A dramatic example of the use of this method in close conjunction
with suitable constraints on the formation and placement of handwritten numeric digits was
demonstrated at the Eastern Joint Computer Conference held in Washington in 1957, 1/ and we should
note that a similar technique had previously been disclosed in the patent literature. 2/ Dimond of the
Bell Telephone Laboratories reported at this 1957 EJCC Conference on a successful method for the
automatic reading-recognition of numeric digits handwritten by telephone toll switchboard operators
in the Bell System, who produce in the aggregate some two billion toll tickets, with 20 to 30
characters each, per year. This automatic reading may be accomplished either by direct, by-product,
machine-language data generation through use of a stylus in combination with a special recording
device, or by subsequent machine recognition of characters recorded on paper in accordance with
preprinted guides.

The stylus-recording-inscription device is simple in operational principle, easy to use,
portable, and presumably inexpensive, all very desirable features in terms of the proposed
application.     This  device  has  been  tentatively  termed  a  "Stylator",   and  its  basic  principles  are  as 
follows: 

"A  writing  surface  is  provided  on  which  there  are  two  guide  dots  surrounded  by  a  set 
of  criterial  areas  consisting  of  seven  conductors  embedded  in  a  plastic  plate.     As  a  numeral 
is written with a stylus connected to a source of potential, the stylus energizes, one at a
time, the conductors in the criterial areas involved in the numeral. The combination of
areas energized causes certain flip-flops in a translator to operate and drive the rest of the
translator to indicate the correct numeral." 3/

In  other  words,   the  conductors  marking  the  significant  areas  serve  as  vectors  which,    if 
crossed  by  the  stylus  as  the  numeric   digit  is  written,    will,    in  accordance  with  specific  vector- 
crossing  patterns,    serve  to  identify  the  digit  that  was  written.     In  the  Bell  Stylator  device,   the 
sequence  in  which  the  stylus  crosses  various  vectors  as  the  characters  are  produced  is  important 
for  recognition  purposes. 

An  independent  invention  was  disclosed  in  the  Johnson  patent,   assigned  to  IBM,    which  provides 
for  the  use  of  two  centering  dots  and  of  radial  areas  extending  from  these  dots  for  the  sensing  of 
conduction of the marks or crossings constituting the numeral written. 4/ Figure 16 is a reproduction
of two cards processed in an experimental reader, based on this principle, demonstrated at the
Western Joint Computer Conference of 1961. Figure 16(a) represents source patterns correctly read
and recognized. In Figure 16(b), however, there is a nonrecognition of the fourth handwritten digit,
because that digit, "6", was not properly formed in accordance with the system constraints.
Provided  that  the  required  vector  crossing  pattern  is  not  violated,   however,    considerable  variation 
in  the  exact  shape  of  the  handdrawn  digits  can  be  tolerated. 

Similarly, supposing that in effect suitably chosen vectors are superimposed on the image field
in which a previously written, typed, or printed character appears, the detection of the intersections
of a black portion of the character with any of these vectors provides an input pattern which can then be
matched to reference patterns, leading to a discriminating identification formula. Such a method for
character recognition shows certain similarities to the peephole (critical-discriminant-area)
template method, but it is much more tolerant of stylistic idiosyncrasies, size variations, and skew,
except where the peephole-template can be readily expanded, contracted, and rotated in accordance
with preliminary scan-detection of character-edge boundary limits.
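The matching of vector crossings against reference patterns may be sketched as follows. This is an illustration only, not a description of any of the systems cited: the image field is assumed to be a small grid of text rows with '#' marking black cells, and the two test vectors, the three reference shapes, and all function names are hypothetical choices made for the example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[int, int]        # (row, column) of a cell in the image field
Vector = Tuple[Point, Point]   # a straight test vector between two cells


def cells_on_vector(vector: Vector) -> List[Point]:
    """List the grid cells lying along a vector (plain linear interpolation)."""
    (r0, c0), (r1, c1) = vector
    steps = max(abs(r1 - r0), abs(c1 - c0), 1)
    return [
        (round(r0 + (r1 - r0) * t / steps), round(c0 + (c1 - c0) * t / steps))
        for t in range(steps + 1)
    ]


def crossing_pattern(image: Sequence[str], vectors: Sequence[Vector]) -> Tuple[int, ...]:
    """Input pattern: one bit per vector, set if any black cell ('#') lies on it."""
    return tuple(
        int(any(image[r][c] == "#" for r, c in cells_on_vector(v))) for v in vectors
    )


def identify(pattern: Tuple[int, ...],
             references: Dict[str, Tuple[int, ...]]) -> Optional[str]:
    """Return the single reference whose crossing pattern matches, else None."""
    hits = [name for name, ref in references.items() if ref == pattern]
    return hits[0] if len(hits) == 1 else None


# Hypothetical vectors: one horizontal (row 1) and one vertical (column 1).
VECTORS = [((1, 0), (1, 4)), ((0, 1), (4, 1))]

VERTICAL_BAR = ["..#.."] * 5
HORIZONTAL_BAR = [".....", ".....", "#####", ".....", "....."]
CROSS = ["..#..", "..#..", "#####", "..#..", "..#.."]

REFERENCES = {
    "vertical bar": crossing_pattern(VERTICAL_BAR, VECTORS),
    "horizontal bar": crossing_pattern(HORIZONTAL_BAR, VECTORS),
    "cross": crossing_pattern(CROSS, VECTORS),
}

# A displaced, thicker vertical bar still yields the same crossing pattern.
SHIFTED_BAR = ["...##"] * 5
print(identify(crossing_pattern(SHIFTED_BAR, VECTORS), REFERENCES))
```

The displaced and thickened bar is still identified because only the crossings are recorded, not the exact cells occupied, which is the tolerance to shape variation noted above.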

1/ Dimond, T. L. "Devices for reading handwritten characters," Ref. 98.
2/ Johnson, R. B. "Indicia-controlled record perforating machine," Ref. 240.
3/ Dimond, T. L. "Devices for reading handwritten characters," Ref. 98, p. 236.
4/ Johnson, R. B. "Indicia-controlled record perforating machine," Ref. 240.

It will be noted that in both the IBM and Dimond examples the vectors are designed in a manner
intended to optimize discriminations between the members of a relatively small character vocabulary,
where the possible shapes and sizes of actual character-images recognizable as a particular member
of the vocabulary vary within certain constraints, such as the requirement that the handwritten
numeral be centered about two preprinted dots. Other work at Bell Laboratories related to the
Dimond technique has included the use of stylus-movement, vector-crossing principles for the
identification of handwritten names of the decimal digits, 'four', 'nine', etc. 1/

In  some  cases,   a  few  vertical  and  a  few  horizontal  scan  lines,   in  a  'cross-hatched'  scan,    will 
identify  individual  characters  by  detection  of  crossings  or  intersections.     In  the  case  of  a  specially 
stylized font, such as Rabinow's 'cutless' cut font, 2/ only horizontal or only vertical crossings may
be employed. It has been reported that at the Steklov Institute for Mathematics, in Moscow,
"Dubinsky is working on the idea of drawing a series of parallel lines through a line of print and then
attempting to identify the characters by examining the intersections of the series of parallel lines with
the letters." 3/

Marill and Green 4/ have used a variation of the vector crossing scheme to illustrate a
proposed model of pattern recognition as applied to handwritten characters. In this scheme,
involving in effect a polar scan, 5/ where the vectors dissect the image field at 45° intervals, they measure
the  distance  along  each  radial  vector  from  the  edge  of  the  field  until  the  first  character  portion 
crossing  is  detected.     Kelly  and  Singer  6/  also  report  means  for  obtaining  characteristics  of  curves, 
for  character  recognition  purposes,   by  measuring  distances  from  the  center  of  gravity  with  respect 
to  radial  vectors. 
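A polar scan of the kind just described, measuring along each radial vector the distance from the edge of the field to the first crossing, may be sketched as below. The sketch is ours and rests on simplifying assumptions: the field is a grid with an odd number of rows and columns, '#' marks black cells, the eight 45° vectors are taken as unit steps on the grid, and the function and constant names are hypothetical.

```python
from typing import List, Optional, Sequence

# Unit grid steps (row change, column change) for 0, 45, ..., 315 degrees,
# i.e. E, NE, N, NW, W, SW, S, SE with rows numbered downward.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def polar_scan(image: Sequence[str]) -> List[Optional[int]]:
    """For each radial vector, count the steps from the field edge inward
    until the first black cell ('#') is met; None if the vector is never crossed."""
    n_rows, n_cols = len(image), len(image[0])
    centre = (n_rows // 2, n_cols // 2)
    distances: List[Optional[int]] = []
    for dr, dc in DIRECTIONS:
        # Collect the cells from the centre out to the edge of the field.
        ray = []
        r, c = centre
        while 0 <= r < n_rows and 0 <= c < n_cols:
            ray.append((r, c))
            r, c = r + dr, c + dc
        # Walk back inward from the edge and stop at the first crossing.
        distance = None
        for steps, (rr, cc) in enumerate(reversed(ray)):
            if image[rr][cc] == "#":
                distance = steps
                break
        distances.append(distance)
    return distances


RING = [
    ".......",
    ".#####.",
    ".#...#.",
    ".#...#.",
    ".#...#.",
    ".#####.",
    ".......",
]
BAR = ["...#..."] * 7

print(polar_scan(RING))
print(polar_scan(BAR))
```

The ring gives the same distance on every vector, while the bar gives zero on the two vertical vectors and a larger distance elsewhere; it is this profile of distances, rather than the exact image, that would be compared with reference patterns.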

Vectors of different angles with respect to the source pattern field are assumed in an
invention of Oliver. 7/ He discloses the use of a carrier for the reference pattern elements such as
a disc, drum, or belt that operates cyclically in one direction. This carrier contains slits or
apertures that are arranged at various angles to the carrier's direction of travel, so that each of the
slits will coincide with just those character strokes that are parallel to a given slit. Thus the source
pattern as it is projected is scanned along several different axes of the image field.

In  other  character  recognition  systems  that  have  been  proposed,   the  vector  crossing  method  is 
applied  with  respect  to  a  larger  number  of  vectors  than  the  previously  discussed  systems  require. 
Moreover,   in  some  of  the  latter  systems,   the  vectors  are  regularly  oriented  and  displaced  with 
respect  to  one  another,    with  provision  that  the  relative  displacement  be  proportional  to  the  size  of  the 
character,    that  is,   by  the  provision  of  means  for  automatically  varying  the  size  of  the  scanning 
pattern in accordance with the size of the source pattern to be identified. 8/

Such a system is exemplified in the "Rotating Raster Character Recognition System," described
by Weeks with respect to work at the IBM San Jose Laboratories, in August 1960. In these
experiments, both machine printed and handwritten numeric characters served as source patterns
which were scanned: "In television fashion with a raster consisting of six lines uniformly distributed
over the digit at six different angles uniformly spaced 30 degrees apart." 9/ The vectors in this case
are the 36 scan lines, which are proportionally spaced by preliminary scanning. There is then
processing which determines the first scan line crossing and the last and then divides the intervening
area into 7 equal groups to obtain 6 equally spaced areas across the character image. The number of
vector crossings or intersections per scan line is now counted, since different portions of the
character image may be crossed by the same scan line. The counts are compared with probability
tables, which are based on statistics gathered from previously scanned and processed characters and
thereby provide the identification formulas.

1/ See, for example, Refs. 192, 193, 277, 539.
2/ I.e., a font in which, by deliberate design, only a few characters of the vocabulary are
sufficiently ambiguous in the system to require cuts for discrimination.
3/ Ware, W. H., ed. "Soviet computer technology, 1959," Ref. 524, p. 100.
4/ Marill, T. and D. M. Green. "Statistical recognition functions and the design of pattern
recognizers," Ref. 294.
5/ See Bomba, J. S. "Alpha-numeric character recognition using local operations," Ref. 54,
for a distinction between the polar scan vector crossings technique and a feature extraction
process also utilizing coincidence with a radial pattern, but requiring that all or nearly all the
input pattern cells along the radii coincide with those of the reference pattern.
6/ Kelly, P. M., and J. R. Singer. "Bio-computer design," Ref. 251.
7/ Oliver, W. C. "Scanning device," Ref. 344.
8/ Rohland, W. S. "Character reader," Ref. 390, and "Character sensing system," Ref. 391.
9/ Weeks, R. W. "Rotating raster character recognition system," Ref. 528, p. 2 (preprint).
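The crossing-count procedure described for the rotating raster system may be illustrated by the following sketch. It is not the Weeks implementation: only one of the six raster angles (horizontal) is shown, the image is assumed to be a grid of text rows with '#' for black, the tables are built from two toy samples, and the probability floor of 0.01 is an arbitrary assumption to avoid taking the logarithm of zero.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

N_LINES = 6  # scan lines per raster angle


def crossings(line: str) -> int:
    """Count the separate black runs ('#') met along one scan line."""
    return sum(1 for i, ch in enumerate(line) if ch == "#" and (i == 0 or line[i - 1] != "#"))


def crossing_counts(image: Sequence[str]) -> List[int]:
    """Divide the span between the first and last inked rows into 7 equal
    parts and count crossings on the 6 scan lines at the dividing positions."""
    inked = [r for r, row in enumerate(image) if "#" in row]
    first, last = inked[0], inked[-1]
    rows = [round(first + k * (last - first) / (N_LINES + 1)) for k in range(1, N_LINES + 1)]
    return [crossings(image[r]) for r in rows]


def build_tables(samples: Dict[str, List[Sequence[str]]]) -> Dict[str, List[Counter]]:
    """For each character and scan line, tally how often each count was observed."""
    tables: Dict[str, List[Counter]] = {}
    for label, images in samples.items():
        per_line = [Counter() for _ in range(N_LINES)]
        for image in images:
            for i, n in enumerate(crossing_counts(image)):
                per_line[i][n] += 1
        tables[label] = per_line
    return tables


def classify(image: Sequence[str], tables: Dict[str, List[Counter]], floor: float = 0.01) -> str:
    """Pick the character whose tables make the observed counts most probable."""
    counts = crossing_counts(image)

    def score(label: str) -> float:
        total = 0.0
        for i, n in enumerate(counts):
            tally = tables[label][i]
            total += math.log(max(tally[n] / sum(tally.values()), floor))
        return total

    return max(tables, key=score)


ZERO = [".####."] + ["#....#"] * 6 + [".####."]
ONE = ["..#..."] * 8
TABLES = build_tables({"0": [ZERO], "1": [ONE]})

UNKNOWN = ["..##..", ".#..#.", "#....#", "#....#", "#....#", "#....#", ".#..#.", "..##.."]
print(crossing_counts(UNKNOWN), classify(UNKNOWN, TABLES))
```

In the system as reported, the same counting would be repeated at each of the six raster angles, giving 36 counts in all.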

4.1.6 Criterial Feature Analysis

(We should note first, before describing what we have termed the "Criterial Feature Analysis"
approach to problems of automatic character recognition, that the word "criterial" may pose certain
difficulties for the reader. It is not strictly a coined, but, rather, a shorthand, term. As a word, it
does not appear in Webster's (unabridged) International Dictionary. Instead, the word "criterional"
appears, with precisely the denotation and connotation we intend to convey by our shorter term,
namely, that that which is 'criterial' is that which is: "Of, or pertaining to . . . a standard of judging;
a rule or test, by which facts, principles, opinions, and conduct are tried in forming a correct
judgment respecting them." 1/ In other words, the criterial features matching approach attempts to
isolate those characteristics and properties of a source pattern that are essential for recognition.)

The criterial features approach is intended to determine and apply the "machine-idea of an
'E'" 2/ in the identification of a given source pattern as indeed an "E", or as not an "E", and so on
for the other characters of the vocabulary. Similarly, in early work in automatic pattern recognition,
Selfridge and Dinneen emphasized that the recognition problem is one of classifying possible
configurations of input data into equivalence classes such that many different configurations belong
to the same class. Selfridge in particular defines the basic problem of pattern recognition as "the
extraction of the significant features from a background of irrelevant detail." 3/ In simulations of
machine recognition of handdrawn characters on the Lincoln Laboratory Memory Test Computer, they
recognized  the  embarrassingly  large  number  of  reference  patterns  necessary  for  the  template 
approach  to  variable  configurations  of  the  same  character.     They  therefore  specifically  suggested 
various  types  of  filtering  of  the  source  pattern  in  order  to  extract  significant  features,    such  as  edges 
and  corners. 

Character  stroke  analysis  is  an  obvious  example  of  the  criterial  feature  method,    where  an  "E" 
might  be  identified  on  the  basis  of  an  input  pattern  consisting  of  the  record  of  detection  of   "a  long 
vertical  stroke  followed  by  three  horizontal  strokes"  regardless  of  how  large  the  source  pattern  was, 
where  it  was  in  the  image  field,   whether  there  were  serifs,   and  the  like. 

Metzelaar  discusses  character  stroke  analysis,   in  part,   as  follows: 

"Several  researchers  have  found  that  as  far  as  printed  letters  are  concerned,   one  of 
the  simplest  methods  of  avoiding  the  complications  due  to  variations  of  location,    size,   and 
density, is to recognize letters by their 'character strokes.' In this technique, letters are
scanned, for instance, by a scanner and photomultiplier . . . The scanner has a beam much
narrower than the width of a letter . . . The number of intercepts of the light beam, the length
of each intercept, and the relative location of intercepts are used as three basic inputs to
the recognition logic." 4/
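The three inputs named in the passage quoted above (number of intercepts, length of each, and relative location) may be extracted as in the following sketch. It is an illustration under our own assumptions: the letter is a grid of text rows with '#' for black, each column stands for one sweep of the narrow beam, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def intercepts(sweep: str) -> List[Tuple[int, int]]:
    """Return (start, length) of each black run met by one sweep of the beam."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, ch in enumerate(sweep + "."):  # trailing white cell closes a final run
        if ch == "#" and start is None:
            start = i
        elif ch != "#" and start is not None:
            runs.append((start, i - start))
            start = None
    return runs


def scan_columns(image: Sequence[str]) -> List[Dict[str, list]]:
    """Sweep every column top to bottom and record the three basic inputs."""
    height = len(image)
    features = []
    for c in range(len(image[0])):
        sweep = "".join(row[c] for row in image)
        runs = intercepts(sweep)
        features.append({
            "count": len(runs),                                   # number of intercepts
            "lengths": [length for _, length in runs],            # length of each intercept
            "positions": [start / height for start, _ in runs],   # relative location
        })
    return features


LETTER_E = [
    "####",
    "#...",
    "###.",
    "#...",
    "####",
]

for column in scan_columns(LETTER_E):
    print(column)
```

For this "E" the first sweep meets one long intercept and the following sweeps meet three short ones, which corresponds to "a long vertical stroke followed by three horizontal strokes." Because the locations are expressed as fractions of the height, the same description holds for a larger letter.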

In  addition  to  stroke  analysis,    per  se,   the  criterial  features  analysis  may  involve  detecting  the 
occurrence  of  specific  connections  between  strokes,   a  specified  sequence  of  strokes  and  connections, 
or of specially chosen shape segments. Shepard's Patent No. 2,663,758, on "Apparatus for Reading,"
was granted under date of December 22, 1953. 5/ This patent discloses a basic design which employs
a mechanical rotating scanning disk having radial slits and a stationary slit, arranged to move a spot
of light across the printed characters from top to bottom as the print moves from right to left. A
photocell examines the reflected light and generates the input pattern.

1/ Webster's New International Dictionary of the English Language, 2nd ed., unabridged, p. 627.
We note, moreover, that the term "criterial areas" is specifically used by Dimond (Ref. 98) in
this sense.
2/ A paper presented at the 15th National Conference of the Association for Computing Machinery,
August 1960, by D. S. Himmelman and J. T. Chu of RCA (Ref. 213), reported on studies both
of human recognition of systematically deteriorated character patterns and of a computer
simulation program to explore optimum discriminant areas and other criterial features of a
topological nature. They reported that, for their program, the machine idea of the 'E' could
be further simplified to the formula: "A black stroke followed by three dots to the left, and
any multiples thereof."
3/ Selfridge, O. G. "Pattern recognition and modern computers," Ref. 410; Dinneen, G. P.
"Programming pattern recognition," Ref. 100.
4/ Metzelaar, P. "Mechanical realization of pattern recognition," Ref. 305, p. 7.
5/ Shepard, D. H. "Apparatus for reading," Ref. 415.

In  a  particular  embodiment,    which  was  built  and  demonstrated  by  IMR,   the  input  pattern  is  a 
series  of  pattern  elements  detected  by  means  of  a  specially  designed  scanning  disk.     This  latter  disk 
comprises  a  rotatable  mask,    divided  into  sectors,    each  of  which  sectors  has  a  fixed  arrangement  of 
specially  shaped  openings  through  which  light  may  or  may  not  be  sensed,    depending  upon  the  shape  of 
the  source  pattern  being  scanned.     Thus  when  any  one  sector  of  the  mask  is  superimposed  on  the 
image  field,    an  input  pattern  element  is  registered  for  that  sector  if  any  portions  of  the  image  being 
scanned  coincide  with  the  sector  openings  so  as  to  reduce  light  to  the  photocell.     All  sectors  of  the 
mask  are  passed    sequentially  over  the  image  field  a  certain  number  of  times  as  the  unknown  symbol 
to  be  read  is  moved  laterally  across  the  scanning  field.     The  input  pattern  thus  consists  of  the  series 
of  hit-no-hit  indications  for  each  sector  as  a  sub-set  of  input  pattern  elements  and  of  the  series  of 
hit  indications  per  sector  per  rotation  cycle  comprising  the  complete  input  pattern  element  set. 

The  scanning  disk  with  specially  placed  and  specially  shaped  apertures  in  different  sectors  is 
in  effect  a  plurality  of  mask  arrays,    which  select  criterial  segments  of  character  shape  such  as 
those shown in Figure 13, example f, p. 35. Such a disk thus serves as a master reference pattern.
The  element-by-element  match  of  the  input  pattern  with  the  total  vocabulary  of  reference  patterns 
coincides  with  the  transformation  of  the  source  pattern  to  an  input  pattern  and  its  transformation  to 
the  format  of  the  reference  pattern.     The  latter  transformation  also  effects  a  considerable  reduction 
in  the  total  information  available  in  the  original  source  pattern,    by  selecting  for  matching  only  those 
portions  of  the  total  symbol  image  which  will  serve  to  differentiate  it  from  other  symbols  in  the 
vocabulary. 

Transformation  of  the  results  of  the  element-by-element  match  (hit  indications  from  the  various 
apertures),    which  in  this  case  are  identical  with  the  input  pattern,    consists  in  the  determination  of 
which  sectors  have  had  hits  and  in  what  order,    as  the  observed  identification  formula.     This  is  then 
compared  with  each  of  the  set  of  master  identification  formulas  to  achieve  recognition. 

The system described above is thus illustrative of early reduction to practice of criterial
features matching methods, as was, for stroke analysis, a Laboratory for Electronics reader. 1/
In the last few years, a great variety of criterial features have been proposed and tested for use in
automatic character recognition. This is true for character and pattern recognition research and
development efforts both in the United States and abroad. For example, Grimsdale, et al., 2/ at the
University of Manchester in England have explored the use of criterial features with respect both to
areas and to shape characteristics such as slope of a character line. Recent Russian work in
character recognition has also included this approach, as in a system proposed by A. G. Vitushkin
which determines characteristic feature combinations such as 'long vertical line to the right,' 'three
short vertical lines to the left,' and the like. 3/

Somewhat  similar  effects  may  also  be  achieved  in  a  variation  of  the  coordinate  description 
method  to  provide  a  special  form  of  criterial  analysis  in  which  certain  cells  (coordinate  positions)  of 
the  superimposed  grid  must  be  black  and  certain  cells  white,    but  the  majority,    for  any  given 
character  pattern,   may  be  'grey'  -  that  is,    in  an  indeterminate  or  'don't-care'  status.     In  statistical 
studies  made  by  Baran  and  Estrin,    for  example,   the  a  priori  probability  distribution  for  each  cell 
being  black,    white,   or  grey  is  determined  with  the  assistance  of  a  computer.     As  a  result,   they 
report,    "The  offspring  stencil  used  is  not  of  the  outline  of  the  character  being  recognized,   but  is 
distorted to emphasize those subareas which are significant in differentiating one character from all
others." 4/
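The black, white, and grey cell status just described may be illustrated by a small matching routine. The sketch below is not the Baran and Estrin procedure: the stencils are hand-made for two toy 3-by-3 characters rather than derived statistically, and the cell codes ('B' must be black, 'W' must be white, '?' grey or don't-care) and function names are our own assumptions.

```python
from typing import Dict, List, Sequence


def fits(image: Sequence[str], stencil: Sequence[str]) -> bool:
    """True if every 'B' cell of the stencil is black ('#') in the image and
    every 'W' cell is white; grey cells ('?') are not examined."""
    for image_row, stencil_row in zip(image, stencil):
        for cell, rule in zip(image_row, stencil_row):
            if rule == "B" and cell != "#":
                return False
            if rule == "W" and cell == "#":
                return False
    return True


def candidates(image: Sequence[str], stencils: Dict[str, Sequence[str]]) -> List[str]:
    """Names of all stencils that the image satisfies."""
    return [name for name, stencil in stencils.items() if fits(image, stencil)]


# Hypothetical stencils: only the middle row distinguishes "O" from "C".
STENCILS = {
    "O": ["???", "?WB", "???"],
    "C": ["???", "BWW", "???"],
}

NOISY_O = [".##", "#.#", "###"]   # one corner cell missing
NOISY_C = ["###", "#..", "##."]   # one corner cell missing

print(candidates(NOISY_O, STENCILS))
print(candidates(NOISY_C, STENCILS))
```

The missing corner cells do not affect the result because they fall in grey positions; only the few cells that differentiate one character from the other are tested.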

In  this  sense,   a  weighted  area  scanning,    or  weighted  matrix  correlation  technique,   may  be 
considered  to  be  the  more  general  principle  subsuming  both  holistic  templates  and  reductive 
templates  such  as  the  peephole-templates  previously  described.     That  is,    the  weight  for  the  area 
exactly  covered  by  a  given  photographic  negative  pattern  or  stencil  in  holistic  systems  is  one,    and 
for  all  other  areas  zero,   whereas  in  peephole-templates  the  weighted  areas  are  precisely  the  test 
points where either black or white is seen.

1/ "Interim report of the check reader," Ref. 264; see also Glauberman, M. H., Refs. 163, 164, 165.
2/ Grimsdale, R. L., F. H. Sumner, C. J. Tunis, and T. Kilburn. "A system for the automatic
recognition of patterns," Ref. 185.
3/ Garmash, V. A. "Seminar on reading machines," Ref. 156, p. 11.
4/ Baran, P., and G. Estrin. "An adaptive character reader," Ref. 34, p. 33.

Reductive templates of the peephole type have been derived, for the most part, by manual trial-
and-error methods or on an intuitive basis. 1/ In fact, it was not until programs of computer
simulation were developed that the weighted, and therefore criterial, coordinate description methods
began to receive widespread attention. In the Japanese Electrotechnical Laboratory system, for
example, the criterial black-white coordinate positions were arrived at after extensive computer
analysis of character samples. 2/ In general, however, we will distinguish between template systems,
whether  holistic  or  reductive,   and  systems  that  seek  criterial  features  in  the  sense  of  properties  that 
are  relatively  invariant  under  transformations  of  exact  area  or  position  or  of  size. 

Since  criterial  feature  extraction  may  be  considered  as  a  case  of  property  filtering,    related 
basic research simulating perception by living organisms is also of potential interest. Loebner, for
example, has developed a recognition device, combining optical and electronic techniques, specifically
designed to simulate the property filtering that occurs in the eye of the frog. 3/ We shall consider
criterial  feature  extraction  and  analysis  in  more  detail  in  discussing  certain  characteristics  of 
representative  recognition  systems  in  a  later  section  of  this  report. 

4.1.7 Curve-following Recognition Techniques

In addition to the various commonly used methods for character recognition shown in Figure 13,
one additional technique that has frequently been suggested should be discussed. This is the technique
of curve-following or contour-direction-change tracing. A curve-tracing device, consisting of a spot
of light travelling in a circle, can be made to follow the direction of a black-white boundary and to
generate x and y deflection voltages which are a function of the shape of the continuous outline of a
character. The waveforms can subsequently be analyzed in terms of the equivalent reference patterns
for the vocabulary characters. Advantages of this technique would include tolerance for considerable
variation in size or exact shape of symbols scanned. Disadvantages, however, include relative
intolerance of broken strokes and of fill-in, such that the gap between different strokes might be
insufficient for tracing reentrant sections of the character edges.

The curve-following method may be applied in conjunction with either a template principle (the
exact shape of the contour) or a criterial feature principle (the number and kind of changes of line
direction along the contour). In fact, Kamentsky specifically includes "edge tracing" in the
"Searching for Features" category of his classification of various recognition methods. 4/ Sequences of
contour change similarly provide the "edge sequence" criteria for certain recognition sets as
proposed by Unger. Unger, however, notes that for the more general case of two-dimensional
graphic patterns, including badly malformed characters as well as abstract geometric shapes, the
edge-tracing sequences are often insufficient for discrimination in a particular vocabulary set. He
suggests that considerations of relative edge position are often necessary, and that, as in an example
given of a badly distorted "H", questions of relative proportion may also be involved. He further
suggests that: "An arbitrary set of standards may be chosen for each target set, such as the
requirement: 'The width of the wider leg must not exceed twice the width of the narrower leg.'" 5/ Such
an approach obviously combines curve-following feature extraction with other types of criterial
feature analysis.
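The criterial use of curve-following data, that is, the number and kind of changes of direction along the contour, may be sketched as follows. The sketch assumes that a curve follower has already delivered the outline as an ordered, closed list of distinct (x, y) points, and it quantizes each step crudely into one of eight compass directions by the signs of the coordinate changes; both are assumptions of ours. The second function simply restates the leg-width standard quoted from Unger.

```python
from typing import List, Sequence, Tuple

DIRECTION_NAMES = {
    (1, 0): "E", (1, 1): "SE", (0, 1): "S", (-1, 1): "SW",
    (-1, 0): "W", (-1, -1): "NW", (0, -1): "N", (1, -1): "NE",
}  # y is taken to increase downward


def sign(value: int) -> int:
    return (value > 0) - (value < 0)


def direction_sequence(outline: Sequence[Tuple[int, int]]) -> List[str]:
    """Reduce a closed outline to its sequence of direction changes,
    discarding how long the trace continues in any one direction."""
    directions: List[str] = []
    n = len(outline)
    for i in range(n):
        (x0, y0), (x1, y1) = outline[i], outline[(i + 1) % n]
        name = DIRECTION_NAMES[(sign(x1 - x0), sign(y1 - y0))]
        if not directions or directions[-1] != name:
            directions.append(name)
    # The outline is closed, so a run split across the start point is merged.
    if len(directions) > 1 and directions[0] == directions[-1]:
        directions.pop()
    return directions


def legs_acceptable(width_a: float, width_b: float) -> bool:
    """Unger's example standard: the wider leg must not exceed twice the narrower."""
    return max(width_a, width_b) <= 2 * min(width_a, width_b)


SMALL_SQUARE = [(0, 0), (4, 0), (4, 4), (0, 4)]
LARGE_SQUARE = [(0, 0), (5, 0), (10, 0), (10, 10), (0, 10)]
TRIANGLE = [(0, 4), (2, 0), (4, 4)]

print(direction_sequence(SMALL_SQUARE))
print(direction_sequence(LARGE_SQUARE))
print(direction_sequence(TRIANGLE))
```

The two squares reduce to the same sequence despite their difference in size, which illustrates the tolerance claimed for the method; the proportion test supplies the additional criterion needed where the edge sequence alone does not discriminate.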

The method of curve following or edge tracing strongly suggests analogy with certain findings
in the area of human perception, namely, that critical points for human recognition of shape seem
to be concentrated along contours or boundaries of sharp color contrast. 6/ Hartline, moreover, has
reported an experimental confirmation using the horseshoe crab, as follows:

1/ Compare Gill, A. "Pattern recognition," Ref. 159, p. 676.
2/ See Iijima, T. "Basic theory of pattern recognition," Ref. 225, and Wada, H. et al. "An
electronic reading machine," Ref. 521.
3/ Loebner, E. E. "Image processing and functional retina synthesis," Ref. 276. See also
"Neuron information theory may be used in computers," Ref. 329, p. 4.
4/ Kamentsky, L. A. "Pattern and character recognition systems; picture processing by nets
of neuron-like elements," Ref. 244.
5/ Unger, S. H. "Pattern detection and recognition," Ref. 501, p. 1741.
6/ See, for example, Attneave, F. and M. D. Arnoult. "The quantitative study of shape and
pattern perception," Ref. 18. Attneave, F. "The relative importance of parts of a contour,"
Ref. 19, and "Some informational aspects of visual perception," Ref. 20; Anderson, Nancy S.
and E. T. Klemmer. "A review of stimulus variables of patterns," Ref. 16.

"Inhibitory interaction in the retina results in the enhancement of brightness contrast.
Since the inhibition is greatest between elements that are close together, contrast is greatest
at borders and edges in the retinal image, where steep gradients of intensity exist. Direct
experimental evidence for this has been obtained from the eye of Limulus." 1/

The technique of curve-following as a means for automatic character recognition was proposed
at least as early as 1953, 2/ and several organizations active in the field have continued to explore
possibilities. For example, the Perkin-Elmer Corporation in the U.S. and the Levy Associates
Company in Canada have been investigating reductive transformations of curve-tracing data as a
means for recognition of specific character and symbol shapes. Russian experiments have been
reported combining a scanning mechanism with a computer program to follow the source pattern
outline, record shifts in direction, and, by computer analysis of the sequence of values for observed
directions and comparison with previously analyzed reference patterns, accomplish recognition. 3/
However, it does not appear that operational results are as yet available nor that a practical reader
using this method is as yet in operation. 4/

Variations on curve-following techniques have been or are being used, however. In discussions
of a paper by Sprick and Ganzhorn at the International Conference on Information Processing (to be
discussed below), several techniques were described. Elkind has obtained 85% accuracy in
experimental recognition of handprinted block capitals by determining slopes of character lines,
dividing the slopes into three categories, and determining the number of incidences for each category
per character. 5/ Preliminary work at the Dahlgren Proving Ground was also reported 6/ in which
curve following is employed, but the input pattern elements consist of indications of horizontal and
vertical motion and of transfers from one mode to another. A variation of curve-following that is
non-reentrant is used in a prototype reader for handwritten numerics at Rabinow Engineering.
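The slope-category count reported for Elkind's experiments may be illustrated by the sketch below. The source does not state which three categories were used or how slopes were measured, so the categories here (near-horizontal, oblique, near-vertical, divided at 22.5 and 67.5 degrees), the representation of a character as a list of straight line segments, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Sequence, Tuple

Segment = Tuple[Tuple[float, float], Tuple[float, float]]


def slope_category(segment: Segment) -> str:
    """Assign one character line to one of three assumed slope categories."""
    (x0, y0), (x1, y1) = segment
    angle = math.degrees(math.atan2(abs(y1 - y0), abs(x1 - x0)))  # 0 to 90 degrees
    if angle < 22.5:
        return "horizontal"
    if angle > 67.5:
        return "vertical"
    return "oblique"


def slope_profile(segments: Sequence[Segment]) -> Tuple[int, int, int]:
    """Number of incidences per category: (horizontal, oblique, vertical)."""
    tally = Counter(slope_category(s) for s in segments)
    return (tally["horizontal"], tally["oblique"], tally["vertical"])


# Hypothetical block capitals described by their line segments.
LETTER_A = [((0, 4), (2, 0)), ((2, 0), (4, 4)), ((1, 2), (3, 2))]
LETTER_H = [((0, 0), (0, 4)), ((4, 0), (4, 4)), ((0, 2), (4, 2))]
LETTER_N = [((0, 0), (0, 4)), ((0, 0), (4, 4)), ((4, 0), (4, 4))]

for name, letter in (("A", LETTER_A), ("H", LETTER_H), ("N", LETTER_N)):
    print(name, slope_profile(letter))
```

Each of the three letters gives a different profile, but many capitals would share a profile under so coarse a description, which is consistent with an accuracy well short of complete.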

Contour-tracing in a sense begins as a direct translation or pantographic copying process, but
the copy-reproduction motions are then subjected, as both Minot 7/ and Fairthorne 8/ have suggested,
to an operational analysis which in effect constitutes the input pattern. Where this operational analysis
effects the elimination of redundancy in continuous-direction line traces we have, as previously noted,
a criterial features input pattern. We might also expect to find possibilities for combination with
either criterial area or vector crossing principles. Thus if the pantographic copying (whether
mechanical or derived from electronic scanning) can be suitably deflected into a given fixed reference
area, crossings of these vectors by the copy motion might be used for recognition purposes. Suitable
adjustments for size and tilt normalization might also be employed.

1/ Hartline, H. K. "Receptor activity and neural interaction in the eye." (Abstract) Ref. 198.
2/ Loeb, J. "Communication theory of transmission of simple drawings," Ref. 275, and Beurle,
R. L. "Letter-tracing device," Ref. 46.
3/ See Garmash, V. A. "Seminar on reading machines," Ref. 156, p. 11, referring to work of
V. A. Kovalevskiy.
4/ Haller, who conducted a master's thesis study on line tracing as simulated on a computer, made
the following evaluation: "The basic conclusion of this paper is that character recognition by
line tracing is feasible and limited only by the complexity of the recognition tree . . . Attempts
to recognize typewritten characters, which are exceedingly rich in frills leads to an impossibly
complex tree." Ref. 190, p. 4.
5/ Ref. 439, Discussion by J. C. R. Licklider, p. 244.
6/ Ref. 439, Discussion by M. S. Maxwell, p. 244.
7/ Thus Minot suggests that one may operationally define the generation of all figures of a given
class, such as, for circles, a complete tracing with return to the starting point, plus constant
curvature. Suitable detecting and classifying procedures would then be applied to unknown input
characters to provide a basis for matching with the previously stored class definitions. "This
sort of process may be called operational identification." (Ref. 310, p. 13.)
8/ Private discussions between R. A. Fairthorne, H. Pfeffer, and M. E. Stevens on possibilities
for recognition of chemical structure diagrams, September 1959.

A  variant  on  the  curve -tracing,  contour -following  method  has  been  developed  by  Sprick  and 
Ganzhorn, 1/ in which contour data for the two opposite sides of the character given by the source
pattern are obtained. Related techniques disclosed by Sprick 2/ include:

(1)  A  prescanning  operation,   which  effectively  locates  the  center  of  a  character; 

(2)  The  generation  of  successive  angularly  displaced  radial  scan  sweeps  from  that 
effective  center  which  produce  signals  when  a  portion  of  the  character  is 
encountered; 

(3)  Means  to  process  these  signals  such  that  their  magnitude  is  a  function  of  the  distance 
from  the  center  to  the  intercepted  character  portion  in  each  of  the  radial  zones,   and 

(4) Means to produce a waveform that approximates the outline of the character scanned. 3/
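
The four steps listed above can be illustrated with a minimal Python sketch. It is a digital analogy only, not the disclosed apparatus: the binary grid, the use of the centroid as the "effective center", the choice of the outermost black cell met on each sweep, and the names `centroid` and `radial_profile` are all assumptions made for illustration.

```python
import math

def centroid(grid):
    """Return the (row, col) centre of mass of the black cells (1s) in a binary grid."""
    cells = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v]
    n = len(cells)
    return (sum(r for r, _ in cells) / n, sum(c for _, c in cells) / n)

def radial_profile(grid, n_sweeps=8, step=0.25):
    """For each angularly displaced sweep, return the distance from the centre
    to the outermost black cell met along that ray (0.0 if none is met)."""
    rows, cols = len(grid), len(grid[0])
    r0, c0 = centroid(grid)                    # step (1): locate the centre
    max_dist = math.hypot(rows, cols)
    profile = []
    for k in range(n_sweeps):                  # step (2): successive radial sweeps
        angle = 2 * math.pi * k / n_sweeps
        dr, dc = -math.sin(angle), math.cos(angle)
        farthest, d = 0.0, 0.0
        while d <= max_dist:
            r, c = round(r0 + d * dr), round(c0 + d * dc)
            if 0 <= r < rows and 0 <= c < cols and grid[r][c]:
                farthest = d                   # step (3): magnitude follows distance
            d += step
        profile.append(farthest)
    return profile                             # step (4): sampled outline waveform

ring = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
profile = radial_profile(ring)
```

For the square outline used here, the diagonal sweeps return larger distances than the horizontal and vertical ones, so the profile rises and falls four times per revolution; a waveform of this kind is what would then be passed on for comparison with reference patterns.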

In  the  two-side  contour  version  of  this  variant  method,   the  waveforms  are  differentiated  and 
passed  through  a  criteria  detecting  circuit  and  further  logic  for  comparison  with  reference  patterns. 
Young,   in  evaluating  this  Sprick- Ganzhorn  technique,   notes  that  in  experiments  where  only  seven 
criteria  combinations  were  used  a  number  of  different  shapes  were  successfully  recognized,   but  that: 

"Considerable  difficulties  have,   however,   been  caused  by  poor  quality  print,    smudges, 
and  specks,   and  these  are  particularly  important  to  solve  when  using  a  fast  contour -following 
technique." 4/

Thus  we  again  find  that  such  factors  as  quality  of  input,    limitations  of  vocabulary,   and  the  like, 
are  of  more  significance  than  specific  differences  of  technique  in  some  of  the  commonly  used  or 
proposed  methods  of  automatic  character  recognition. 

4.2 Process Steps in Character Recognition

The  common  methods  for  character  recognition  discussed  above  have  illustrated  various 
different  combinations  of  means  to  carry  out  process  steps  such  as  are  shown  in  the  generalized 
diagram  of  Figure   12,   p.    34.     Let  us  now  consider  the  steps  of  a  generalized  recognition  process 
in  more  detail.     We  note,   first,   however,   that  many  of  the  transformations  shown  in  Figure   12  can 
be,   for  a  specific  case,    identity  transformations  or  null  operations.     In  the  case  that  operation  7  of 
Figure   12  is  an  identity  transformation,   for  example,    operations  8,    9,    and  10  would  coincide  with 
operations  4,    5,   and  6,    respectively.     In  the  case  that  operations  4  or  8,   or  both,   are  instrumented 
in  a  particular  system  by  the  simultaneous  selection  of  all  reference  patterns,   operations  6  and  10 
would not be activated and operations 5 and 9 would represent multiple comparisons being made in
parallel. 

4.2.1 Input and Transformations of a Source Pattern

Operation  1  in  Figure   12,   the  input  of  a  source  pattern,    represents  the  impingement  of  the  data 
of  the  source  pattern  upon  suitable  receptor  organs.     It  may  be  more  simply  referred  to  as  "pick- 
up".    This  operation  is  typically  under  the  control  of  the  scanning  threshold,    which  is  subject  to  such 
weighting  functions  as  scanner  resolution,   adjustments  of  scanner    sensitivity  to  contrast,   and  the 
like.     For  example,    in  an  application  of  an  automatic  optical  reader  for  the  processing  of  traveller's 
checks, 5/ the scanning mechanism includes a means for 'gloss-sensing' so that the source pattern
pick-up  of  the  serial  number  digits  which  have  high  gloss  characteristics  can  be  accomplished 
despite  interference  and  noise  from  over- stamping. 

This source pattern input operation may include positioning and repositioning of the source
material for effective pick-up, including movements of the source pattern carrier, the scanning
mechanism, or both. In some automatic character recognition systems, when only white areas are
sensed at a given input time, operation 1 is automatically repeated. Both gross positioning and
micropositioning operations are carried out as required in the source pattern input step.

1/ Sprick, W. and K. Ganzhorn. "An analogous method for pattern recognition by following the
boundary." Ref. 439.

2/ Sprick, W. "Character reader," Ref. 441.

3/ As previously noted, a scheme proposed by Marill and Green, although more properly a
vector crossings method, provides the converse of Sprick's method, by measuring the
distance from the boundaries of the image field until first character encounter for each
scan-line, without requiring that the source pattern first be centered. See Ref. 294.

4/ Young, D.A. "Automatic character recognition," Ref. 539.

5/ Installation of a Farrington-IMR reader at the First National City Bank, New York; see
Keller, A.E., Ref. 249, p. 25.

In  some  systems,   there  may  be  a  gross  (low  resolution)  prior  scan  to  locate  fiduciary  marks 
("begin  read"  signals),   or  any  black  area  first  encountered  on  a  page  (or  in  the  expected  image  field), 
or the first black encountered coming up from the bottom edge of an envelope. 1/ Other line-finding
and line-following operations to adjust for tilt of a line of characters which will become, successively,
a sequence of source patterns, may also be involved. Examples include, for gross positioning, an
experimental Farrington-IMR machine for post office mail-sorting applications, and for
micropositioning, techniques such as those disclosed by Bozeman. 2/

In  an  early  Post  Office  address-reader  designed  by  Farrington-IMR,   the  input  process  consists 
first  of  finding  the  lowest  line  of  the  typed  address  on  an  envelope,    with  scan  of  each  successive  one 
of  four  areas   1   1/2  inches  wide  to  determine  where  this  last  line  is  located,   and  with  subsequent  scan- 
ning such  as  to  follow  the  apparent  lowest  line.     The  shadow  cast  by  the  window  of  a  window  envelope 
has,   for  example,    caused  difficulties  in  adjustment  for  this  lowest  line  position.     It  is  noteworthy  that 
at  least  one  organization  active  in  the  development  of  automatic  character  reading  techniques  has 
estimated  that  a  large  percentage  of  the  cost  of  a  reader-system  is  in  precisely  such  housekeeping 
operations  as  line  finding  and  servoing  the  scanning  means  to  follow  a  skewed  or  tilted  line. 

In the Bozeman patent, assigned to Burroughs, the technique for micropositioning (i.e., of the
individual source pattern) is described as follows:

"One of the features of the invention is the utilization of one or more cathode ray tubes
each having sweep circuits that are so synchronized with the scanning of the character that
each vertical sweep of the cathode ray tube starts substantially at the instant when the
scanning spot encounters the starting edge of the character in the corresponding vertical
scan,   and  each  horizontal  sweep  of  the  cathode  ray  tube  starts  concurrently  with  the  begin- 
ning of  the  first  vertical  sweep  (that  is,    with  the  detection  of  the  starting  point).     This  causes 
the  cathode  ray  tube  to  project  upon  its  screen  a  transformed  image  of  the  character  where- 
in the  edge  of  said  character  that  corresponds  to  the  starting  edge  of  the  original  character 
is  brought  into  coincidence  with  a  fixed  reference  line  on  said  screen,   and  the  point  on  said 
image  corresponding  to  the  starting  point  of  the  original  character  is  located  at  one  extremity 
of  said  reference  line.  "    3/ 

The  actual  input  of  a  source  pattern  is  usually  either  followed  or  accompanied  by  operations  2 
and 3 of Figure 12. 4/ Operation 2 typically involves the transformation of the source pattern that
has  been  picked  up  to  an  appropriate  input  pattern  suitable  for  further  processing  in  a  particular 
recognition  system.     For  example,    the  interruption  of  reflected  light  to  a  photocell  as  a  scan  line 
crosses  black  areas  in  the  image  field  is  translated  into  a  continuous  waveform  or  into  pulse  patterns. 
An  example  of  an  identity  transformation  in  this  operation  is  the  optical  reflection  of  the  source 
pattern  image  from  the  original  image  field. 

1/ Fiduciary marks are used, for example, in FOSDIC equipment and in a Baird-Atomic page
reader. The possibilities with respect to system design will be discussed in more detail in
connection with consideration of operational requirements.

2/ Bozeman, J.W. "Character recognition apparatus," Ref. 57.

3/ Bozeman, J.W. "Character recognition apparatus," Ref. 57.

4/ Compare, for example, Kharkevich: "The preparation process may include making the
lines finer, smoothing the contours, eliminating small defects in the print, eliminating
small spots of extraneous origin, increasing the contrast, etc. The preparation may
involve a separate operation which follows inspection using a conventional scanning
system; however, it may also be completely or partially included in the inspection process."
(Ref. 256, p. 21.)

Feedback functions may affect this operation as in cases where the first input pattern is used
to adjust repositioning of the source image, to trigger the beginning of specific recognition steps, to
standardize the dimensions of succeeding input pattern elements or to set the frame of reference for
more detailed analysis of the input pattern. Feedback functions affecting operation 2 may also arise
where any input pattern, or input pattern element, determines the selection of succeeding elements
of the source pattern. 1/ In reader developments at Philco, several different scan modes and
techniques for focussing and defocussing serve to enlarge or reduce the area of pick-up.

In  some  cases,    such  as  the  German  reader  for  typewritten  numerals  previously  noted,    operation 
2  involves,    in  effect,   transformation  of  the  source  pattern  into  a  series  of  input  patterns,    systemati- 
cally displaced  by  shift  register  techniques,    to  accommodate  variations  in  both  horizontal  and  vertical 
registration,   as  well  as  to  provide  a  number  of  different  input  images  in  the  coordinate  description 
manner.     A  major  type  of  operation  frequently  involved  in  the  scanning  pick-up  of  the  source  pattern 
and  its  translation  to  an  appropriate  input  pattern,    required  in  the  coordinate  description  methods,   and 
probably  found  in  the  human  retina  itself,   is  therefore  that  of  "quantization".     That  is,   for  a  given 
mesh  of  theoretically  superimposed  grid,    or  for  the  cells  of  a  retinal  mosaic,   a  decision  is  to  be 
made  as  to  whether  or  not  a  given  cell  (or  coordinate  position)  "sees"  black,   white,   or  some  inter- 
mediate color  value  at  a  given  instant  of  reception  of  sensory  data.     Thus  we  may  assume,   with 
Stearns,    that: 

"1)    The  pattern  to  be  recognized  always  may  be  made  to  occupy  a  fixed,   finite, 
rectangular  area  on  a  plane. 

"2)    This  area  may  be  divided  into    n    elements. 

"3) Each element may be designated either black or white." 2/

Such binary quantization is typically achieved in one of two ways: either by integration over the
source pattern sub-area 'sensed' by a particular cell and a comparison of the result with some given
threshold value which will govern the decision as to 'black' or 'white', or by a local averaging
operation in which a presumed 'black' cell is given an evaluated 'black' or 'white' value in accordance
with a criterion based upon the relative 'blackness' or 'whiteness' of its immediately adjacent
neighbor cells. It should be noted, of course, that such quantization need not be limited to the binary
case. In fact, in a Rabinow Engineering reader design, four levels of quantization -- 'black', 'medium
gray', 'light gray', and 'white' -- are utilized. 3/
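
The two binary quantization routes and the four-level case can be sketched in Python as follows. This is a hedged illustration rather than any reader's actual logic: darkness values are assumed to lie between 0.0 (white) and 1.0 (black), and the thresholds, the neighbour criterion, and the function names are hypothetical.

```python
def quantize_by_threshold(gray, cell, threshold=0.5):
    """Integrate (average) darkness over each cell x cell sub-area and
    compare the result with a threshold to decide black (1) or white (0)."""
    rows, cols = len(gray) // cell, len(gray[0]) // cell
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            total = sum(gray[i * cell + a][j * cell + b]
                        for a in range(cell) for b in range(cell))
            row.append(1 if total / (cell * cell) >= threshold else 0)
        out.append(row)
    return out

def local_average(binary, min_black_neighbours=2):
    """Re-evaluate each presumed black cell from its adjacent neighbours:
    it stays black only if enough of them are black too (assumed criterion)."""
    rows, cols = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c]:
                black = sum(binary[r + dr][c + dc]
                            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                            if (dr or dc)
                            and 0 <= r + dr < rows and 0 <= c + dc < cols)
                if black < min_black_neighbours:
                    out[r][c] = 0
    return out

LEVELS = ("white", "light gray", "medium gray", "black")

def quantize_levels(darkness, cuts=(0.25, 0.5, 0.75)):
    """Four-level quantization; the cut points are illustrative only."""
    return LEVELS[sum(darkness >= c for c in cuts)]

gray = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.9, 0.0, 0.2],
    [0.1, 0.0, 0.0, 0.1],
    [0.0, 0.1, 0.9, 0.0],
]
coarse = quantize_by_threshold(gray, cell=2)
```

In the sample grid the single dark element in the lower right is absorbed by integration over its sub-area, whereas the local-averaging route would reach a comparable decision by consulting the neighbours of an isolated black cell.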

Taylor, 4/ however, makes considerable point of preserving differential values for relative
black or white as actually sensed in each cell of the scanning grid until a later stage in the recognition
process, so that in his system the input pattern is claimed to be an accurate 'analog' of the original
image pattern. He thus claims, as of 1958, that, "The new apparatus differs fundamentally from any
general-purpose pattern-recognition apparatus that has been proposed in the past, since the analogue
form of the signals is preserved right up to the output indicating devices." This difference would
seem to be one of sequence of processing steps rather than of basic recognition principle since, on
the one hand, a distinction between 'long', 'medium', or 'short' black strokes, as in a Laboratory
for Electronics reader, 5/ would also conserve a certain amount of 'analog' data per scan segment,
and since, on the other hand, the mesh (e.g., 9x9) with which Taylor has worked is of considerably
coarser grain (or resolution) than some coordinate description input patterns to which
integration-threshold or local averaging techniques, or both, are applied. Also, of course, criterial area or
'optimum discriminant area' weightings, as noted, among others, by Wada, 6/ as disclosed in a
recent patent issued to Highleyman, 7/ and as 'learned' by a reward-punishment procedure in certain
of the Perceptron systems, achieve fundamentally the same effect.


1/ The early Laboratory for Electronics reader discussed in Ref. 264, for example, is one in
which only those source pattern elements which differ from preceding elements are used as
elements of the input pattern.

2/ Stearns, S.D. "A method for the design of pattern recognition logic," Ref. 451, p. 48; see
also McLachlan, D., Jr. "Description mechanics," Ref. 286.

3/ Rabinow Engineering Company. "Character recognition machines -- principles of operations,"

It should be noted that the identical-copy optical template techniques do not depend upon
quantization  for  derivation  of  the  input  pattern.     In  many  of  the  characteristic  waveform  matching 
methods,    the  effect  of  quantization  is  achieved  indirectly,   by  the  amplitude  differences  reflecting  the 
relative  density  of  black  (or  of  magnetization)  in  a  scan  segment  or  sub-area,   for  example. 

We  should  note,    moreover,    that  even  when  quantization  is  not  done  intentionally,    it  happens  as 
a  result  of  the  finite  frequency    bandwidth  of  electrical  circuits.     Whenever  this  bandwidth  is  not 
several  times  as  great  as  the  highest  frequency  component    required  for  "perfect  facsimile" 
reproduction  of  the  scanned  character,    there  will  be  some  smoothing  or  quantizing  (i.  e.  ,    the  number 
of  measurably  different  amplitudes  will  be  limited). 

A  special  case  of  quantization  in  connection  with  a  criterial  type  of  line -tracing  has  apparently 
been  explored  in  the  work  of  the  Russian  scientist,    Glushkov.     The  method  is  reported  to  involve  the 
scanning  of  a  square  sub-area  in  order  to  determine  its  average  blackness,   and  then  to  proceed  as 
follows: 

"The  beam  would  be  programmed  to  scan  adjacent  squares  in  order  to  determine  the 
direction  of  maximum  darkness  and  to  move  in  that  direction.     The  associated  logical 
equipment  would  store  this  direction  and  control  the  motion  of  the  beam  accordingly.     The 
direction  and  curvature  of  motion  would  then  be  computed  using  first  and  second  order 
differences.     A  large  change  of  direction  would  indicate  that  such  an  inflection  point  should 
be  recorded.     If  necessary,   a  coarse  measure  of  length,    such  as  short,    average,    or  long, 
will  also  be  computed  for  each  line  segment.     It  is  hoped  that  from  such  a  scanning  scheme, 
a set of invariant characteristics can be determined for each character." 1/
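
The scanning scheme quoted above can be sketched, under stated assumptions, as a walk over a grid of average-blackness values. The grid, the darkness floor, the eight-direction code, the tie-breaking order, and the names `trace` and `turns` are hypothetical; only the first-order direction difference is computed here, and the reported method is known to us only through the description quoted.

```python
# Direction codes in 45-degree steps: E, NE, N, NW, W, SW, S, SE.
DIRS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def trace(dark, start, floor=0.5):
    """Move repeatedly to the darkest unvisited adjacent square and
    return the stored sequence of direction codes."""
    r, c = start
    visited = {start}
    moves = []
    while True:
        best = None
        for k, (dr, dc) in enumerate(DIRS):
            nr, nc = r + dr, c + dc
            if (0 <= nr < len(dark) and 0 <= nc < len(dark[0])
                    and (nr, nc) not in visited and dark[nr][nc] >= floor):
                if best is None or dark[nr][nc] > dark[best[1]][best[2]]:
                    best = (k, nr, nc)
        if best is None:
            return moves            # nothing dark enough remains nearby
        moves.append(best[0])
        r, c = best[1], best[2]
        visited.add((r, c))

def turns(moves, min_turn=2):
    """Indices at which the first-order difference of direction is large
    (at least min_turn steps of 45 degrees), i.e. candidate inflection points."""
    found = []
    for i in range(1, len(moves)):
        diff = (moves[i] - moves[i - 1]) % 8
        if min(diff, 8 - diff) >= min_turn:
            found.append(i)
    return found

# A crude "L": a vertical stroke followed by a horizontal stroke.
letter_l = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
moves = trace(letter_l, start=(0, 0))
corners = turns(moves)
```

For this figure the trace runs down the vertical stroke and then along the base, and the single large change of direction marks the corner; counting the moves between recorded points would give the coarse "short, average, or long" length measure mentioned in the quotation.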

Operation 2, for the transformation of the source pattern to the required input pattern, may
also include various steps of image improvement, 2/ such as contrast enhancement or integration of
the relative black or the relative white for the image sub-area scanned in a given interval of the scan
cycle. A variety of "cleanup", "defuzzing", 3/ and skeletonizing or "inlining" operations may be
included in the operations of sensing the source pattern and its transformation to the input pattern.
These may be applied overall, as in the removal of "salt" (small white spots or holes in areas where
the character is generally black) and "pepper" (small black spots in generally white area), 4/ or
special "local operations" may be performed on successive small sub-areas. In the Solartron ERA
machines, blurred edges are: "Considerably cleaned up by applying pulse-width discrimination to
the line scan." 5/
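
As a minimal sketch of "salt" and "pepper" removal, assuming a binary grid and assuming that only single isolated cells are treated as noise, the following Python function flips any cell whose adjacent neighbours all carry the opposite value. The function name and the isolation criterion are illustrative choices, not a description of any particular machine.

```python
def remove_salt_and_pepper(grid):
    """Flip every cell whose adjacent neighbours are all of the opposite
    colour: isolated white holes ("salt") and isolated black specks ("pepper")."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            neighbours = [grid[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr or dc)
                          and 0 <= r + dr < rows and 0 <= c + dc < cols]
            if all(v != grid[r][c] for v in neighbours):
                out[r][c] = 1 - grid[r][c]
    return out

noisy = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0],   # a hole inside the black area, a speck in the white
    [1, 1, 1, 0, 0, 0],
]
cleaned = remove_salt_and_pepper(noisy)
```

Decisions are taken from the original grid and written to a copy, so that the result does not depend on the order in which cells are visited.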

In some cases, the skeletonizing or cartooning operations found in input pattern transformations
involve the erasing of all of a character stroke except its outermost edges, as in the edging
operations reported by Dinneen in 1955, 6/ and in the 'custering' operation developed at the National
Bureau of Standards for picture-processing experiments on SEAC. 7/ Other techniques, however,
are directed toward normalizing the character stroke to a predetermined average width or thickness,
as in Bomba's system, 8/ or to an otherwise idealized line. Uhr says of his system, for example,
that:


"Figures  are  sharpened  by  a  single  process  that  does  something  much  like  drawing  an 
average  or  essence-giving  line  through  the  figure.     This  process  makes  use  of  a  Gestalt- 
theory concept: a force field toward closure and toward the 'good' figure." 9/
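
The edging type of operation mentioned above, in which everything but the outermost edges of a stroke is erased, admits a very short sketch. The version below is one plausible reading on a binary grid, not a reconstruction of the Dinneen or National Bureau of Standards routines: a black cell is kept only if one of its four side neighbours is white or lies outside the field. The name `edge_only` is hypothetical.

```python
def edge_only(grid):
    """Keep a black cell only when at least one of its four side
    neighbours is white (or off the grid); erase interior cells."""
    rows, cols = len(grid), len(grid[0])

    def is_white(r, c):
        return not (0 <= r < rows and 0 <= c < cols) or grid[r][c] == 0

    sides = ((-1, 0), (1, 0), (0, -1), (0, 1))
    return [[1 if grid[r][c] and any(is_white(r + dr, c + dc) for dr, dc in sides)
             else 0
             for c in range(cols)]
            for r in range(rows)]

blob = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
outline = edge_only(blob)
```

Width-normalizing techniques work in the opposite sense, retaining a line of standard thickness through the stroke rather than its boundary.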


To an extent, as we have previously noted, shift register translation processing in the derivation of
the input pattern from the original source image scanned is a cleanup operation for fuzzy edges and
smudging of characters, at least with respect to the leading edge. However, noise, dirt, and
smudging in interior portions of the character and mutilations of the character stroke may also
occur. 1/ In other cases, therefore, special operations are performed, including the 'local averaging
operations'  previously  mentioned  in  connection  with  source-to-input-pattern  quantization  for 
coordinate  description  and  other  recognition  methods. 

These local averaging and improvement operations may actually occur directly as the result of
variable settings of the scanning threshold controlling operation 1 of Figure 12 (p. 34), but more
typically they are carried out in the process of integration and cleanup in the conversion of the source
pattern to the input pattern, or in the transformation of that pattern to the required reference pattern
format. In a few cases, however, the practical effects of local averaging and improvement operations
may be achieved at later stages of the recognition-processing cycle; for example, during the matching
process for certain of the photographic template systems, or by the effects of variable weightings
directly related to the original intensity of a specific sub-area of the source pattern in the matching
of the identification formulas. 2/

We  have  previously  noted  the  very  large  number  of  possible  coordinate  descriptions  of  any 
single  character  pattern  subject  to  the  transformations  of  size,    registration,   and  rotation.     When  we 
take account, in addition, of the problem of handling the randomly superimposed noise -- dirt,
smudging, bleeding and filling of characters, matrix hairline or ribbon fiber impressions,
deterioration or mutilation including complete breaks in character strokes, and the like -- the number
of potential reference patterns is even further multiplied. For this reason, the majority of
researchers who have considered this problem are strongly of the opinion that the cleanup operations
should be placed as near the beginning of the sensing-recognition cycle as possible. 3/

Still  another  commonly  encountered  category  of  source-to-input-pattern  transformations 
includes  various  procedures  and  techniques  for  the  derivation  of  an  input  pattern  from  the  source 
pattern  by  "optical  filtering",    "photometric  analysis",    "autocorrelation  techniques",   and  "field 
potential  analysis".     The  "lenticular  lens"  array  of  a  system  designed  by  Briggs  Associates,   Inc.    is 
another  example  of  a  specialized  transformation  carried  out  as  an  operation  2  or  an  operation  3 
process  step.     The  lenticular  array  is  described  as  consisting  of  two  sets  of  minute   hemispherical 
ribs  positioned  at  right  angles  to  each  other  where  each  lens  of  the  array  serves  as  a  minute, 
independent  spherical  lens  to  direct  the  light  from  a  parallel  light  source  to  discrete  positions  on  a 
viewing  plane.     The  lens  array  can  be  tilted  to  provide  various  viewing  angles.     For  example,    36 
distinct patterns can be discriminated in a 6x6 directional matrix master pattern. 4/ We shall defer
consideration  of  some  of  the  other  special  transformations  to  a  later  section  discussing  some  of  the 
design  characteristics  of  selected  systems. 

Of  course,   the  continuous -time  waveform  produced  by  single-slit  scanning  in  certain  of  the 
characteristic  waveform  techniques  for  character  recognition,    the  binary  encoding  of  coordinate 
description  hits,   and  the  hit-or-no-hit  patterns  for  criterial  features  analysis  or  for  vector  crossings, 
are  also  input  pattern  forms  that  are  derivatives  of  the  original  source  pattern. 

Where  operation  3  of  Figure   12  is  other  than  an  identity  transformation  on  the  input  pattern,    it 
involves  a  translation  of  the  input  pattern  into  the  repertory  of  pattern  elements  used  as  the  basis 
for identification in the system. It typically consists of such steps as unitizing (or quantizing) of the
input pattern, breaking up of a continuous waveform into discrete signals, substituting counts of the
number of pulses for the pulses themselves, determining the relative length of pulses or black blobs
sensed and substituting symbols representative of these characteristics for the characteristics
themselves, and other encoding operations. This operation may also involve the selection from the total
input  pattern  of  those  elements  and  only  those  elements  which  are  significant  for  identification,    as  in 
several  of  the  criterial  area  aperture  techniques. 
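
Two of the encoding steps just named, substituting a count for the pulses themselves and substituting symbols for their relative lengths, can be sketched as follows. The scan is assumed to be a sequence of 0 (white) and 1 (black) samples; the length limits and the symbols 'S', 'M', and 'L' are hypothetical.

```python
def black_runs(scan):
    """Return the lengths of the consecutive black runs (pulses) in one scan."""
    runs, length = [], 0
    for cell in scan:
        if cell:
            length += 1
        elif length:
            runs.append(length)
            length = 0
    if length:
        runs.append(length)
    return runs

def encode_scan(scan, medium=3, long_run=5):
    """Replace the pulses of a scan by their count and by one symbol each
    for short ('S'), medium ('M'), or long ('L') duration."""
    symbols = "".join(
        "L" if n >= long_run else "M" if n >= medium else "S"
        for n in black_runs(scan)
    )
    return len(symbols), symbols

scan = [0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1]
code = encode_scan(scan)
```

The twelve samples of the scan are thereby reduced to a count of three and a three-symbol string, which is the kind of reduction operation 3 is meant to achieve before matching.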
1/ See, for example, the breaks in the typewritten "L" of Figure 19, p. 79.

2/ For example, in a technique proposed by Hau Wah Lo: "The main scheme is to 'feed' into the
program a sufficient number of arbitrarily written figures of each alphabet, record the
positions of the points covered by the figure, and them by multiplying by a constant
term. Thus, the positions on the screen 'most frequently' crossed by each figure are in
storage." Ref. 274, p. 63.

3/ See Refs. 244, 276, 305, 500, 501.

4/ Brown, L.R. "Nonscanning character reader uses coded wafer," Ref. 62.
In a special case of peephole template matching, operation 3 is not a null operation, but rather
is used to transform various possible input pattern configurations to a standardized master reference
pattern template. This technique is proposed by Mauch for the development of improved reading aids
for the blind. In this scheme, the initial input pattern is derived from the source pattern by means of
an aperture mask optimally designed for a particular type font. Then, in operation 3, fiber optics
are used to transform this pattern into another which is processed by a second aperture array which
has a standardized arrangement of apertures, a reference pattern serving as master for more than
one type font. 1/

Transformations  of  the  input  pattern  to  the  required  reference  pattern  format  typically  involve 
such  processing  steps  as  obtaining  and  recording  a  specific  sequence  of  criterial  area  coincidences 
or  of  zone  or  vector  crossings.     Included  would  be  operations  to  eliminate  redundancy,    such  as 
ignoring  scan  segments  which  give  the  same  results  or  scores  as  immediately  preceding  scans, 
ignoring  line    lengths  until  the  line -direction  changes,   and  the  like.     Two  examples  are  described 
below. 

The  first  example  is  that  of  a  check  reader  designed  by  the  Laboratory  for  Electronics  for  the 
Chase  Manhattan  Bank.     The  system  consists  of  the  following  elements: 

(1)  A  photoelectric  scanner  where  segments  of  the  character  to  be  identified  are 
examined  sequentially; 

(2)  An  encoding  unit  where  the  scanner  output  of  input  pattern  elements  is  manipulated 
in  accordance  with  built-in  logic; 

(3)  Shift  register  storage  where  the  encoded  data  for  the  input  pattern  elements  is  stored 
until  the  entire  source  pattern  has  been  scanned,    and 

(4)  A  decoder  unit  where  the  coincidence  of  a  specific  code  pattern  for  an  input  pattern 
with  the  code  pattern  for  one  of  the  reference  patterns  in  the  vocabulary  results  in 
positive  identification  of  the  unknown  character  represented  by  the  source  pattern. 

During  scanning  the  source  pattern  to  be  read  is  moved  past  a  lens  which  projects  a  magnified 
image  upon  a  column  of  photocells  sufficient  in  number  to  accommodate  the  height  of  the  image  plus 
allowance  for  variations  in  vertical  registry.     The  outputs  of  these  cells  are  modulated  by  black 
regions  caused  by  the  portion  of  the  source  pattern  in  the  image  field,   and  are  read  out  sequentially 
through  circuits  which  at  each  scan  time  produce  a  pulse  for  every  photocell  output  indicating  the 
detection  of  a  black  region  and  a  pulse  for  every  combination  of  photocell  outputs  that  indicates  the 
detection  of  a  long  black  region.     These  pulses  are  then  transmitted  to  the  encoding  unit  where 
counts  of  the  total  number  of  pulses  per  scan  and  of  the  number  of  long  black  pulses  per  scan  are 
combined  into  one  of  five  discrete  pulse  code  patterns.     The  encoded  results  of  each  scan  are  then 
transferred  to  shift  register  storage  subject  to  the  restrictions  that  the  first  scan  of  a  character  is 
not  used  for  recognition  and  that  only  those  scans  that  differ  from  the  immediately  preceding  scan 
are  sent  to  storage. 

These restrictions accomplish several purposes. The first scan (operation 1) is used not for
recognition but to trigger the recognition process for the succeeding scans. This non-use of the first
scan for recognition also minimizes uncertainties in horizontal registry. The non-use of succeeding
scans that are identical to the immediately preceding scan serves to limit the number of input pattern
elements necessary to achieve identification. This procedure tends to eliminate the effect of
variations in the width of the printed characters, and it means that the document-handling speed need
not be exactly synchronized with the scan rate. For example, if the document speed is such that the
movement of the source pattern through the image field is slower than the scan rate, the source
pattern merely appears wider with respect to the scan rate.
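The effect of the two storage restrictions can be sketched in a few lines. This is only an illustration, assuming that each scan has already been reduced to a pair (total pulses, long pulses) and that the comparison with "the immediately preceding scan" includes the registration scan itself; the function name and the sample values are hypothetical and are not taken from the reader described.

```python
def scans_to_store(encoded_scans):
    """Return the scans that would be sent to shift register storage.

    encoded_scans: list of (total_pulses, long_pulses) tuples, one per scan.
    The first scan only triggers recognition and is never stored; a later
    scan is stored only if it differs from the scan immediately before it
    (assumption: the registration scan counts as a preceding scan).
    """
    stored = []
    previous = encoded_scans[0] if encoded_scans else None
    for scan in encoded_scans[1:]:
        if scan != previous:
            stored.append(scan)
        previous = scan
    return stored


# Hypothetical scans of the same character at two document speeds.
narrow = [(1, 0), (2, 1), (3, 0), (2, 1)]
wide = [(1, 0), (1, 0), (2, 1), (2, 1), (2, 1), (3, 0), (3, 0), (2, 1)]
print(scans_to_store(narrow))
print(scans_to_store(wide))
```

Both calls yield the same stored sequence, which is the sense in which the procedure is insensitive to character width and to document speed.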

1/ "Our future masks will now consist of two parallel plates, each having a pattern of openings
connected by lucite light guides. The openings in the plate facing the photocells have a
standard arrangement, while the openings in the plate facing the text are arranged optimally
for each type print. The lucite light guides between the plates provide a considerable amount
of freedom regarding the mutual arrangement of the openings in the two plates." Ref. 300,
p. 5-6.

Fitting the input pattern to the reference pattern format requires, as in several other systems,
that the encoded results of the scans be systematically shifted through storage until all the encoded
results are in known positions in storage. At this time matching identification interrogation is
triggered. A diode matrix with appropriate gating provides means for built-in identification formula
storage, and the subsequent operations of selection and decision. Thus, if the master identification
formula for the numeral "5" requires the occurrence of two pulses and one long pulse on the first
scan after the registration scan, the occurrence on the next non-identical scan of three pulses and no
long pulse, and the occurrence on a subsequent non-identical scan of two pulses and one long pulse,
and the encoded elements of the input pattern derived from the scans of an unknown character
(observed identification formula) coincide with this pattern, that output line of the matrix indicating
identification of "5" will be energized.
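The comparison of an observed identification formula with stored master formulas can be sketched as a table lookup. The entry for "5" follows the example in the text; the entry for "1", the function name, and the requirement of exact equality over the whole sequence are assumptions made for illustration, and the table stands in for the diode matrix only loosely.

```python
# Each master formula is a sequence of (total_pulses, long_pulses) per stored scan.
MASTER_FORMULAS = {
    "5": ((2, 1), (3, 0), (2, 1)),  # taken from the example in the text
    "1": ((1, 1),),                 # hypothetical entry, for contrast only
}


def identify(observed):
    """Return the character whose master formula equals the observed one.

    Returns None (a reject) when no formula, or more than one, coincides.
    """
    observed = tuple(observed)
    matches = [ch for ch, formula in MASTER_FORMULAS.items() if formula == observed]
    return matches[0] if len(matches) == 1 else None


print(identify([(2, 1), (3, 0), (2, 1)]))
print(identify([(3, 0), (3, 0)]))
```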

A second example of input pattern to reference pattern transformation is provided in a system
proposed by Greig and reported by Boni in a study on Russian printing prepared to define critical
factors for automatic character readers for Cyrillic-alphabet source material. 1/ This is an
interesting variation on the encoded coordinate description in which the initial input pattern is a
black-white quantized pattern with respect to the superimposed grid, but in which the input pattern,
as transformed to the required reference pattern format, consists of an encoding of the number of
black cells in each row and in each column of the array.

In this method, the superimposed grid is called a "shape discriminator sieve". It provides for
"articulation of characters by horizontal levels and vertical zones into spatial elements corresponding
either to figure elements of characters or background elements of characters and the areas containing
characters." 2/ Certain of the horizontal levels comprise a "constant band" which embraces the
normal height of characters, exclusive of ascenders and descenders. 3/ This can be utilized as a
guide for adjusting to the actual registration of individual characters and character lines.

The  transformation  of  the  input  pattern  to  the  reference  pattern  format  results  in  a  code  string 
of  digits  representing  'black'  counts  for  each  successive  row  and  each  successive  column.     An 
additional  code  for  relative  character  width  is  an  optional  feature.     Using  both  row  and  column  codes, 
it  is  claimed  that  each  of  the  270  characters  of  a  linecasting-machine  family  of  Cyrillic  types  can  be 
discriminated,   provided  that  the  alphabets  involved  are  of  the  same  point-size  in  Roman,   italic,   and 
bold  face.     As  a  result  of  simulation  studies,   it  is  further  claimed  that  for  a  more  limited  vocabulary 
the row codes alone should suffice, specifically:

"The 64 characters of American Linotype type face Russian no. 3 of reference quality
(i.e., images derived from pattern-plate impressions), when coded according to their
horizontal parameter values ... constitute a non-ambiguous set; therefore, ignoring the
vertical parameters ... has validity for a universe of characters to be recognized in which
the character images correspond closely in form to reference-quality images." 4/

Greig  and  Boni  note  that  such  a  universe  is  small,   but  they  suggest  the  possibilities  of  deriving  from 
a  number  of  input  samples  sufficient  statistical  information  to  provide  alternative  reference  patterns. 
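The row and column coding can be illustrated with a short sketch. The tiny 3x3 grids below are hypothetical shapes chosen for brevity, not Cyrillic characters from the study, and the function name is an assumption; the sketch only shows how two patterns may share a row code and still be separated by their column codes.

```python
def row_column_codes(grid):
    """Count the black cells (1s) in each row and in each column of a grid."""
    row_code = [sum(row) for row in grid]
    column_code = [sum(column) for column in zip(*grid)]
    return row_code, column_code


# Two hypothetical quantized patterns that are mirror images of each other.
LEFT_ANGLE = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
RIGHT_ANGLE = [
    [0, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
]

print(row_column_codes(LEFT_ANGLE))
print(row_column_codes(RIGHT_ANGLE))
```

Here the row codes are identical and only the column codes differ, which is why the row codes alone suffice only for a vocabulary in which no such pair occurs.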

Thus, in general, operations 1, 2, and 3 of Figure 12 serve a variety of purposes, including
the sharpening and improving of the character image, the reduction of redundancy, and the
elimination of noise and irrelevant details. We should note, however, that any manipulation of the
basic input data loses some of it. For example, a scheme for correcting minor breaks or omissions in
continuity of the type stroke may cause a "C" to become an "O" if there is a speck between the ends
of the "C". Similarly, a scheme for removing specks may open an "O" into a "C" if the "O" is a
little thin where the "C" would normally be open. Hence, while such processing improves the
effectiveness of scanning devices with respect to good quality material, it can degrade their
effectiveness with poor quality material, and even for good quality source patterns it introduces a
finite, if small, additional probability of error.

1/ Boni, C. "Russian type study," Ref. 55. Section 5 of this study was prepared by J. Greig.

2/ Ref. 55, p. 1-4.

3/ Note that, for Cyrillic characters, even upper case characters may involve ascender and
descender levels, as shown in Figure 14.

4/ Ref. 55, p. 6-5.

4.2.2 Matching-Recognition-Identification

Once the appropriate input pattern has been derived from the source pattern on the carrier
medium, it must be compared with one or more reference patterns. In particular, the elements of
the input pattern are to be compared with the elements of the reference pattern or patterns.
Operation 4 is preparatory to the matching operations. It involves a setting of the selection of
reference patterns representing the vocabulary so as to compare the input pattern elements in order
against the reference pattern elements of each of the available reference patterns in turn. The order
in which the reference patterns are selected may be important in devices using recognition techniques
such as optical matching where an "E" input pattern may cover an "F" reference pattern and thus
result in a false identification. In some recognition systems, operation 4 of Figure 12 (p. 34) is
automatically repeated until all reference patterns have been tried, regardless of the degree of match
found in the succeeding process step, operation 5.

Repetition of operation 4 for all possible reference patterns, regardless of degree of match, is
particularly important in systems which are based on reference patterns that provide probabilistic
bases for identification. In some criterial feature analysis systems, for example, each reference
pattern is a statistically determined value for certain properties of the expected character, such as
the property of having a wholly enclosed white area (the "lakes" vs. "inlets" criterion of
Rochester), 1/ or of having a certain hit-score with respect to specific inspection-point n-tuples of
the Bledsoe-Browning method. 2/ The "rotating raster" scheme described by Weeks 3/ is another
example. Here the numbers of crossings encountered per scan-line comprise the input pattern
elements. Reference patterns are built up by statistical studies of sample characters which may vary
both in size and in details of shape, so that the reference pattern elements are the probability scores
for a given expected character for each crossing-count and scan-line combination.

For  each  scan-line  in  the  rotating  raster  system,   a  matrix  is  established  that  records,   in  each 
element,   the  probability  of  occurrence  of  the  number  of  crossings  indicated  by  the  column  if  the 
system is scanning a pattern proper to the character indicated by the row. Next, as Weeks describes
the process:

"The unknown character is scanned and a particular number, say k, of crossings is
encountered. Merely by inspecting the column, k, and selecting the element which has
the largest conditional probability, the character most likely to give this particular number
of crossings for this particular scan is determined."

The process must be repeated for each scan-line to derive the information necessary for
identification-decision.
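The column inspection described by Weeks can be sketched as follows. The probability values, the two-character vocabulary, and the function names are hypothetical. The text does not say how the per-scan-line results are combined into a decision, so the simple majority vote used here is an assumption made only to complete the example.

```python
from collections import Counter

# One table per scan-line: TABLES[line][character][k] is the assumed
# probability of k crossings on that scan-line given that character.
TABLES = [
    {"I": [0.0, 0.9, 0.1], "H": [0.0, 0.2, 0.8]},
    {"I": [0.0, 0.9, 0.1], "H": [0.1, 0.8, 0.1]},
    {"I": [0.0, 0.8, 0.2], "H": [0.0, 0.1, 0.9]},
]


def most_likely_per_line(crossings):
    """For each scan-line, inspect column k and pick the row with the largest entry."""
    picks = []
    for table, k in zip(TABLES, crossings):
        picks.append(max(table, key=lambda ch: table[ch][k]))
    return picks


def decide(crossings):
    """Combine the per-line picks by majority vote (an assumption, see lead-in)."""
    return Counter(most_likely_per_line(crossings)).most_common(1)[0][0]


observed_crossings = [2, 1, 2]  # crossings counted on three scan-lines
print(most_likely_per_line(observed_crossings))
print(decide(observed_crossings))
```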

In some cases operation 4 is, in effect, either 'wired-in' or a null operation. The reading
system may provide for simultaneous comparisons of the input pattern with all the reference patterns
of a given vocabulary, by parallel processing. Various techniques for parallel processing are
utilized in readers developed by Baird-Atomic, Briggs Associates, Rabinow Engineering, and RCA,
for example. Systems involving simultaneous parallel processing typically have a closed vocabulary
at any given time. That is, the only character patterns that are recognizable in the system are known
in advance. The reference patterns for the Briggs Associates character recognition developments,
for example, are determined by photographing samples of the permitted characters through the
lenticular lens array. The closed-vocabulary requirement is found in any of the exact template-matching
techniques. In principle, of course, new or improved template reference patterns can be added to
these systems, but there is a limit (e.g., 100-200) to the total number that can be used for parallel
processing at any one time.

In some cases, operation 4 for selection of reference patterns is in effect replaced or
supplemented by operations to repetitively regenerate reference patterns or reference pattern
elements. Cook, for example, in a further development of the ideas suggested by the DOFL First
Reader 4/ and the electronic version by Greenough and Gordon, 5/ proposed the scanning of both the
unknown source pattern and each of the character vocabulary masks concurrently. 6/ Thus the
subsequent matching is no longer on the basis of optical fit, although the reference patterns are of
the positive optical template type, but rather on a comparison of the time-varying signals produced
as the input pattern with the similar signals generated for each of the vocabulary characters
simultaneously, the comparisons being made in parallel, and a 'best-fit' decision being made in terms
of zero or minimum mismatch. An earlier example of this principle is provided in the patent awarded
to Ayres in 1935, and assigned to IBM. 7/

In operation 5, the one or more elements of the input pattern are compared either serially or in
parallel with the corresponding one or more elements of a selected reference pattern or patterns. In
a system where the input pattern is not segmented or quantized the input pattern element is the input
pattern as a whole which is compared with the selected reference pattern as a whole. For example,
in the DOFL First Reader, the character image as a whole serves as the input pattern which is
matched with a reference pattern that is a photographic negative image of one of the vocabulary
characters. Systems using encoded input patterns, such as the number of black crossovers encountered
in one scan across a segment of the character image, compare such pattern elements element-by-
element with the cross-overs for the corresponding scan-unit element of the vocabulary reference
pattern.

In still other cases, a single master reference pattern is used or implied. It may consist, for
example, of a single peephole-mosaic mask which is superimposed on the black and white pattern
having a one-to-one correspondence with the inked and not-inked areas of the original source pattern.
It may also be a simple function table decoder such that for any given input of a pattern consisting of
two-valued pattern elements (signals) one and only one output can be energized. For example, in the
reader suggested by Stone for the Rome Air Development Center, 1/ the elements of the input pattern
are the black-white indications for each of the 13 specially shaped areas, whereas, in effect, the
elements of each reference pattern are the black-or-white requirements, for each of these areas,
for each character symbol allowed in the vocabulary of the system.

The output of the matching procedure of operation 5 is controlled by threshold values for the
required degree of match, ranging from complete and exact coincidence, to "best fit" comparisons,
and to multiple correlation techniques. Where an input pattern is tried in turn against each of a
number of different patterns in the reference vocabulary until a specified degree of match occurs,
failure to match at operation 5 results in the selection of another reference pattern and repetition of
operations 4, 5, and 6 until all reference patterns have been tried. Failure to match any of the
available reference patterns results in nonrecognition, with appropriate methods for indicating a reject.

There  are,   however,    special  cases  where  the  failure  to  match  any  of  the  reference  patterns 
selected  in  operation  4,   or  in  iterated  selections,   will  serve  as  a  screening  device  to  determine  the 
nature  of  a  particular  font  that  is  involved.     In  the  Philco  reader  for  the  Post  Office,    such  screening 
may  be  used  to  readjust  the    height-width  ratio  for  individual  character  scanning.     For  such  systems 
as  the  parallel  template  matching  machines  where  operation  4  is  a  null  operation  in  the  initial  pass  of 
material  to  be  read,   this  screening  may  activate  a  special  type  of  operation  4  selection,   namely,   the 
replacement  of  the  initial  vocabulary  set  of  reference  patterns  with  another  set.     Thus,   more  than 
the  relatively  limited  number  of  reference  patterns  available  for  parallel  processing  can  be 
accommodated  in  reader  systems  which  allow  for  the  substitution  of  one  vocabulary  set  for  another. 

Results of the element-by-element matching process are then transformed in operation 7 of
Figure 12 to what we have termed an "identification formula". This formula is such that the detected
equivalence of input pattern elements with the corresponding elements of one or more specific
reference patterns can be checked, as necessary, against required sequences of element "hits" and
other requirements. For example, a particular system may require that the element-by-element
matches should have occurred in specified combinations. The identification formula is thus a direct
representation of the various 'and', 'or', 'and-not', and other operators of a particular recognition
logic.

The use of 'and' and 'or' operators at various levels of pattern element matching and
identification formula matching is exemplified in the Solartron ERA system. This has been described
by Bailey, in part, as follows:

"The finally cleaned and clipped raster is time-gated into 'features', compared with
permanently stored features of recognizable characters, and a definite black-or-white
decision made for each of these features by a bistable circuit. The resulting features
are stored in a temporary store, a convenient time-translation device to enable parallel
operations to be done on the features. The operations are combinations of AND and OR
logic in cascade.

"The programming of the logical items, in the form of a block of connections, is
quite complex. The principal organization of this process is that in which a number of
features have an OR criterion applied to them, so that a black (say) in one will register
as black for the group; then groups are connected with an AND criterion to impulse the
output corresponding to a given character, when and only when all groups have the
appropriate sign. Other more complex arrangements are superimposed with some
attention to those in which a final OR combination is used.

"The principal characteristic of the store and logic, however, is that the information
is condensed to the final minimum required by taking combinations of 'bits' and comparing
them by means of an assigned code." 1/
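The OR-then-AND organization described by Bailey can be sketched as follows. The feature names, the wiring for a "T", and the function names are hypothetical and are not taken from the Solartron ERA; the sketch assumes that a group registers black when any of its features is black and that an output fires only when every group shows its required sign.

```python
def group_is_black(features, names):
    """OR criterion: a black in any one feature registers as black for the group."""
    return any(features[name] for name in names)


def output_fires(features, wiring):
    """AND criterion: fire when and only when all groups have the appropriate sign.

    wiring: list of (feature_names, want_black) pairs for one character.
    """
    return all(group_is_black(features, names) == want_black
               for names, want_black in wiring)


# Hypothetical block of connections for the character "T".
WIRING_T = [
    (("top_left", "top_right"), True),         # crossbar present
    (("stem_upper", "stem_lower"), True),      # stem present
    (("bottom_left", "bottom_right"), False),  # no foot
]

features_t = {
    "top_left": True, "top_right": False,
    "stem_upper": True, "stem_lower": True,
    "bottom_left": False, "bottom_right": False,
}
features_with_foot = dict(features_t, bottom_left=True)

print(output_fires(features_t, WIRING_T))
print(output_fires(features_with_foot, WIRING_T))
```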

A fairly elaborate example of the use of a number of different operators in the
recognition-identification logic is provided in the "proportional parts" method for character recognition
developed by Greanias and others at IBM. 2/ This method, which was simulated and extensively
tested on the IBM 650 computer, is designed to use the relative number, position, and size of
character elements occurring in a specified sequence as criteria for character identification. The
proportional parts method presupposes that the source pattern is derived from the character image
by a flying spot scan proceeding from top to bottom and from right to left across the image field.
Each vertical scan is subdivided into intervals and the light signals for each interval are quantized
as black or as white in accordance with selected thresholds. The resultant binary data for each
vertical scan (the input pattern elements) are then encoded to the reference pattern format consisting
of three decimal digits which represent the number, position, and size of the black areas in that
scan.

The  reference  pattern  is  thus  a  single  master  covering  the  allowable  range  of  digit  values  in 
each  of  the  three  code  positions.     Examples  are,   for  the  first-digit  coding,    "1"  for  a  single  black 
area,    "4"  for  a  medium  length  black  above  a  short  black,    "5"  for  a  short  black  above  a  medium 
length black, and the like. The second digit indicates the relative height of the uppermost black area
of a given vertical scan as compared to the uppermost black area of the third previous scan, from
"0" for no change to "4" for a three-unit decrease. Third-digit codings from "0" to "7" indicate
measures of relative length.

The  observed  identification  formula  in  this  case  thus  consists  of  the  codes  for  the  input  pattern 
elements  derived  from  successive  vertical  scans.     Matching  of  the  observed  identification  formula 
with  the  master  identification  formulas  involves  a  logic  of  'and',    'or',    'and-not',   in  sequence  and  with 
specified  delay  (lag)  intervals.     That  is,   the  sequence  of  codes  selectively  derived  from  the  character 
which  has  been  sensed  is  checked  against  a  prescribed  code  sequence  for  each  vocabulary  character. 
This  prescribed  sequence  defines  (1)  the  allowable  codes  that  will  satisfy  a  given  stage  of  the 
identification  decision  process,    (2)  the  maximum  number  of  scans  that  can  occur  before  the  next 
stage  of  the  process  must  be  satisfied,   and  (3)  specific  inhibiting  conditions  or  codes  that  must  not 
occur  at  or  between  particular  stages  of  a  particular  sequence. 
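The three elements of a prescribed sequence (allowable codes, maximum lag, and inhibiting codes) can be sketched as a small sequential check. The three-digit code values, the stage definitions, and the function name are hypothetical, and the sketch assumes that the lag limit counts the scan that satisfies the stage; it is not the logic of the IBM method itself.

```python
def matches_sequence(codes, stages):
    """Check a list of per-scan codes against one prescribed code sequence.

    Each stage is a dict with:
      'allowed' - codes that satisfy the stage,
      'max_lag' - the maximum number of scans that may be examined
                  before the stage must be satisfied,
      'inhibit' - codes that must not occur while the stage is pending.
    """
    position = 0
    for stage in stages:
        satisfied = False
        for _ in range(stage["max_lag"]):
            if position >= len(codes):
                return False
            code = codes[position]
            position += 1
            if code in stage["inhibit"]:
                return False
            if code in stage["allowed"]:
                satisfied = True
                break
        if not satisfied:
            return False
    return True


# Hypothetical two-stage sequence for one vocabulary character.
STAGES = [
    {"allowed": {"102"}, "max_lag": 1, "inhibit": set()},
    {"allowed": {"413", "414"}, "max_lag": 3, "inhibit": {"500"}},
]

print(matches_sequence(["102", "102", "413"], STAGES))
print(matches_sequence(["102", "500", "413"], STAGES))
print(matches_sequence(["102", "102", "102", "102", "413"], STAGES))
```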

Another  example  of  a  specialized  identification  formula  is  the  type  of  encoded  representation 
of  input  pattern  elements  that  results  from  some  of  the  criterial  feature  analysis  techniques.     A 
Russian worker in the field, Blokh, describes this type of formula as follows:

"Whatever characteristics are utilized in recognition, each combination of them,
defining a given image, may be written in the form of an n-denominational and m-graded
number, in which n is the number of characteristics, and values 0, 1, 2, ..., m-1,
represent gradations of these characteristics. In this manner, the technical device
accomplishing character recognition realizes a code (with base m), in which each code
combination corresponds to one image of a given set." 3/
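Blokh's description amounts to reading the list of graded characteristics as the digits of a number in base m. A minimal sketch, with a hypothetical function name and hypothetical grade values:

```python
def characteristics_to_code(grades, m):
    """Read n graded characteristics (each in 0..m-1) as one base-m number."""
    code = 0
    for grade in grades:
        if not 0 <= grade < m:
            raise ValueError("each gradation must lie between 0 and m-1")
        code = code * m + grade
    return code


# Three characteristics (n = 3), each with three gradations (m = 3).
grades = (2, 0, 1)
print(characteristics_to_code(grades, 3))  # 2*9 + 0*3 + 1
print(3 ** len(grades))                    # number of distinct code combinations
```

With n characteristics of m gradations each there are m to the power n possible code combinations, of which only those assigned to images of the given set are used.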

We have noted previously that most of the criterial feature analytic methods result in what may
appropriately be termed values with respect to a property list. Farley, in discussing the broader
problems of pattern recognition, pattern perception, and pattern classification, uses this
terminology and, moreover, stresses the enormous advantages in storage economy that can result from
use of identification formulas rather than template-type reference patterns. For example, Farley
states, in part:

"The idea of a percept as an association, class, or 'bundle' of properties probably
goes back at least as far as Aristotle and has reappeared in one form or another many
times since then ... However, the idea does not seem to have been put forward in a form
definite enough to indicate how it might eventually evolve into a model of perception. There
would seem to be sound reasons from both biological and psychological considerations for
attacking perception from this standpoint ....

"... every object, or (more generally) percept, may be defined by a list of the properties
it possesses and the observed ranges (or better, the probability distributions) of the values of
each property ... Each different class of properties makes up a percept, ... c_1, c_2, etc.
In order to determine to which percept a given input belongs, it is only necessary to measure
its properties and see which class c_j it fits best ...

"One interesting advantage of this model is the vast number of classes c_j which can be
distinguished by means of a few properties. For example, if one hundred measurable
properties are available, and if they average ten distinguishable distributions of value each,
then the number of distinguishable classes is of the order of 10^100. A great deal of the
'storage' for the class recognition is thus accounted for by computation of property
measurements which are used over and over again." 1/

The observed identification formula is typically compared against master identification
formulas in operation 8, under the control of an appropriate recognition threshold setting. If the
observed identification formula matches a specific master identification formula, the decision is
made that the unknown source pattern is equivalent to this one of the patterns of the master
vocabulary. If it does not match, and if all formula-to-formula comparisons have not been made,
a new master formula is selected and operation 9 is repeated. The ultimate results of operation 9
are thus either a positive identification followed by the appropriate steps to transform the
identification decision into a target pattern for output, operations 11, 12, and 13, or a rejection of
the input pattern as unrecognized, operation 14. The occurrence of a reject may lead to a re-trial of
the input pattern at different scanning, matching, or recognition threshold values. 2/ Correlative
functions, such as use of indications derived from context, may be used to adjust the recognition
threshold.

Particularly interesting examples of character recognition principles which illustrate use of
character-feature probabilities and context-dependency adjustments, although not primarily involving
design of a practical character reader, are to be found in various programs involving experiments on
machine recognition of handprinted alphabetic characters. The work of Doyle at M.I.T. has resulted
in computer recognition, correct approximately 87% of the time, of "sloppy" handprinted
characters. 3/ Further investigations of the frequency with which a given character was confused
with some other showed that many of the incorrect recognitions were for character pairs ('A', 'R')
where the source pattern would be almost equally ambiguous for a human reader.

1/ Farley, B. G. "Self-organizing models for learned perception." Ref. 131, pp. 15-16, passim.

The emphases in the Doyle program are on parallel processing of the observed against the
master identification formulas and on derivation of probabilities of occurrence of the results of
criterial feature tests by extensive testing with samples of the characters to be recognized. The
handprinted characters, whether 'teaching' samples or unknowns to be recognized, were constrained
by the frame within which they were printed but otherwise were often badly formed and noisy. The
input pattern for each character was quantized in a 32x32 array, with image enhancement processing
for each 3x3 sub-area in accordance with fill-and-delete rules suggested by Unger. 4/ Various
aspects of this system have been described by Doyle as follows:

"With parallel processing all tests are made before any decision is undertaken.
Though the individual tests may be poor a reliable decision can be available provided
there are enough tests, each contributing some different fractional bit of information.
Such a set of tests may be considered a code with redundancy, an appropriate counter
to noise and distortion ...

"The action to be taken following a particular test should be determined by observing the
results of testing real known samples from the population of characters one eventually hopes
to be able to recognize rather than by applying preconceived rules ...

"The scheme to be described incorporates both these notions: decision reserved until
all test results are in, and discrimination based on 'learning' from real data ...

"Symbols representing the outcomes of various tests applied to each sample ... (are) ...
collected in a table ... which contains a tally of the number of occurrences of each test
outcome for each pattern ...

"Finally the observed incidences of outcomes are combined with assumed a priori
probabilities of occurrence for each character to calculate the inverse probabilities for
each character given test and outcome ...

"The ... (operation) ... 'Decide' forms weighted averages of the inverse probabilities
corresponding to the observed outcomes from the separate tests and indicates which
character corresponds to the highest of these averages of a posteriori probabilities." 5/
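The tally, inverse-probability, and 'Decide' steps quoted above can be sketched as follows. The two tests, their outcome names, the tallies, the equal priors, and the equal default weights are all hypothetical; the inverse probabilities are computed by Bayes' rule, which is an assumption about how the quoted calculation was carried out and not a description of Doyle's program.

```python
def inverse_probabilities(tally, priors):
    """Turn one test's tally into P(character | outcome) for each outcome.

    tally[character][outcome] is the number of teaching samples of that
    character that gave that outcome (every character is assumed to have
    at least one sample, and every outcome at least one occurrence).
    """
    outcomes = {o for counts in tally.values() for o in counts}
    table = {}
    for outcome in outcomes:
        joint = {
            ch: priors[ch] * counts.get(outcome, 0) / sum(counts.values())
            for ch, counts in tally.items()
        }
        norm = sum(joint.values())
        table[outcome] = {ch: p / norm for ch, p in joint.items()}
    return table


def decide(observed, tables, weights=None):
    """Form weighted averages of the inverse probabilities and pick the highest."""
    weights = weights or [1.0] * len(tables)
    totals = {}
    for weight, table, outcome in zip(weights, tables, observed):
        for ch, p in table[outcome].items():
            totals[ch] = totals.get(ch, 0.0) + weight * p
    averages = {ch: total / sum(weights) for ch, total in totals.items()}
    return max(averages, key=averages.get), averages


PRIORS = {"A": 0.5, "R": 0.5}
TEST_1 = inverse_probabilities(
    {"A": {"closed": 8, "open": 2}, "R": {"closed": 7, "open": 3}}, PRIORS)
TEST_2 = inverse_probabilities(
    {"A": {"two_legs": 9, "one_leg": 1}, "R": {"two_legs": 4, "one_leg": 6}}, PRIORS)

winner, averages = decide(["closed", "one_leg"], [TEST_1, TEST_2])
print(winner, averages)
```

In this example the first test barely separates the two characters, while the second shifts the average decisively, which illustrates the point that individually poor tests can still support a reliable decision when their results are combined.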

The Bledsoe-Browning approach 6/ also establishes test outcome probability criteria on the
basis of experience with 'teaching' examples of the expected character population.
In this case the test outcomes are the hit-no-hit patterns for the input character, with respect to
each of a number of arbitrary n-tuple groupings of specific cells of a 10x15 photocell mosaic.
Reference patterns are built up, then, of the observed hit patterns for each of the n-tuple groupings
for the particular character in a particular position. Additional reference patterns are necessary if
wide differences in translation, rotation, size, and other variations of the same character are to be
accommodated. For an unknown source pattern, the input pattern elements (test outcomes) are
derived and a score is developed with respect to each reference pattern, for each coincidence of a
particular hit-no-hit outcome with the same outcome in the reference pattern. When such scores
have been obtained for all members of the reference vocabulary, the highest score wins.
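The scoring procedure can be sketched with exclusive pairs of cells as the n-tuples. The bar-shaped "characters", the orientation of the mosaic (15 rows by 10 columns), the seed, and all function names are assumptions made for illustration; only the overall scheme (teach hit patterns per n-tuple, count coincidences, highest score wins) follows the text.

```python
import random

ROWS, COLS = 15, 10  # assumed orientation of the 10x15 mosaic


def make_pairs(n_cells, seed=0):
    """Randomly group the cells into exclusive pairs (n-tuples with n = 2)."""
    cells = list(range(n_cells))
    random.Random(seed).shuffle(cells)
    return [(cells[i], cells[i + 1]) for i in range(0, n_cells - 1, 2)]


def outcomes(image, pairs):
    """Hit-no-hit pattern of each pair for a flat list of 0/1 cell values."""
    return [(image[a], image[b]) for a, b in pairs]


def teach(samples, pairs):
    """Record, per character and per pair, the hit patterns seen in teaching."""
    memory = {}
    for label, image in samples:
        seen = memory.setdefault(label, [set() for _ in pairs])
        for slot, outcome in zip(seen, outcomes(image, pairs)):
            slot.add(outcome)
    return memory


def recognize(image, memory, pairs):
    """Score one point per pair whose outcome was seen for a character."""
    observed = outcomes(image, pairs)
    scores = {label: sum(o in slot for o, slot in zip(observed, seen))
              for label, seen in memory.items()}
    return max(scores, key=scores.get), scores


def bar(vertical):
    """Hypothetical test pattern: one vertical or one horizontal bar."""
    image = [0] * (ROWS * COLS)
    for r in range(ROWS):
        for c in range(COLS):
            if (vertical and c == 4) or (not vertical and r == 7):
                image[r * COLS + c] = 1
    return image


PAIRS = make_pairs(ROWS * COLS)
MEMORY = teach([("vertical", bar(True)), ("horizontal", bar(False))], PAIRS)
noisy = bar(True)
noisy[0] = 1  # one spurious black cell in a corner
label, scores = recognize(noisy, MEMORY, PAIRS)
print(label, scores)
```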

Bledsoe  and  Browning  first  investigated   this  method  using  as  n-tuples  75  randomly  chosen 
exclusive  pairs  of  cells  of  the  mosaic.     Variations  included  obtaining  probability  values  by  averaging 
scores  for  several  different  'teaching'  alphabet  sets.     Another  variation  involved  consideration  of 
score  distributions  typical  of  each  character,   by  use  of  average  scores  from  additional  alphabets  as 
a 'secondary experience'. Still another variation first 'taught' the system the scores for 10 simplified
shape configurations, 3/ such as horizontal, vertical, and diagonal bars or strokes. Then several
alphabets  were  compared  with  the  earlier  score  material  in  order  to  obtain  score  distributions  which 
in  turn  facilitated  distribution-comparison  decisions  for  new  characters. 

The  basic  technique  was  also  further  modified  to  provide  for  identification  of  alphabetic 
characters  with  respect  to  their  word  contexts.     The  length  of  an  unknown  word  is  first  established 
by  counting  the  number  of  characters  encountered  between  blank  spaces.     The  observed  score  for 
each  of  these  characters  is  then  obtained.     Next,    reference  is  made  to  that  portion  of  a  special 
dictionary  where  words  of  that  length  are  recorded.     The  scores  per  possible  character  identification 
that  coincide  with  the  characters  of  a  valid  word  are  added  in  proper  order,   and  the  word  with  the 
highest  total  score  wins. 
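
The word-context step can be sketched in a few lines. The data layout below (one dictionary of candidate scores per character position) and the function name are assumptions for illustration only; the source does not describe how the scores or the special dictionary were stored.

```python
def best_word(char_scores, dictionary):
    """Pick the valid word with the highest total character score.

    char_scores: one {candidate_letter: score} mapping per character
                 position of the unknown word.
    dictionary:  iterable of valid words.
    """
    length = len(char_scores)   # word length = characters between blanks
    best, best_total = None, float("-inf")
    for word in dictionary:
        if len(word) != length:          # consult only words of that length
            continue
        total = sum(char_scores[i].get(ch, 0) for i, ch in enumerate(word))
        if total > best_total:
            best, best_total = word, total
    return best, best_total


# Hypothetical scores: taken alone, the first character looks more like 'G'.
scores = [{"C": 60, "G": 62}, {"A": 70, "O": 50}, {"T": 65, "I": 40}]
word, total = best_word(scores, ["CAT", "GOT", "DOG", "CATS"])
```

In this made-up case the character-by-character winners would spell "GAT", which is not a valid word; summing scores over dictionary entries selects "CAT" instead.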

4.  2.  3    Target  Pattern  Selection  and  Output 

The  generalized  recognition  process  is  terminated  by  an  appropriate  action  following  identifica- 
tion,  usually  in  the  form  of  the  selection  and  output  of  a  target  pattern  that  is  the  equivalent  of  the 
original  source  pattern.     In  general,    operations   11,    12,    and  13  of  Figure   12  encompass  a  number  of 
well-known  techniques  for  the  conversion  of  a  specific  signal,    indicating  the  identification  of  a 
particular  character,   to  the  desired  output.     The  desired  output  may  be  the  activation  of  a  typewriter 
key, the punching of a punched card code pattern, the generation of the appropriate symbol code in a
particular machine language for direct input to a computer, or, as in the case of reading aids for the
blind, appropriate auditory or tactile patterns representing that character in accordance with a
predetermined convention. In the case of reading aids for the blind, target pattern selection may also
involve  holding  the  results  of  individual  character  decisions  until  an  entire  word  has  been  completely 
read  and  then  synthesizing  a  sound  pattern  representative  of  that  word  as  a  whole. 


1/ Doyle, W. "Recognition of sloppy, hand-printed characters," Ref. 104, p. 133-134.

2/ Bledsoe, W. W. and I. Browning. "Pattern recognition and reading by machine," Ref. 51.

3/ Bledsoe and Browning refer to these as "arbitrary" shapes in the same sense as the random
pairing, but they are in fact closely similar to those used for extraction of criterial features
in several other systems.

We have already noted the possibility that both transliteration and encrypting or decrypting
operations may be included as direct by-products of the recognition-identification process. These
steps would of course require additional logic to accomplish consistent application of conventions
adopted 1/ for target-pattern selection and output. This transliteration may be at the letter, phoneme,
or word level, comprising, for the latter case, in effect a word-for-word translation. In
demonstrations for the International Conference on Scientific Information held in Washington in
November 1958, for example, audiences were requested to select one of four Russian nouns or proper
names for input to a scanning device, SADIE, and to a computer-processing look-up technique
programmed for SEAC, one of the pioneer electronic computers developed by and still in use at the
National Bureau of Standards. The individual characters of the selected noun were then fed one by one
into the machine-processing system. Each was identified by the obviously simple and quite naive
method of gross black-area discrimination for a vocabulary limited to 14 Cyrillic upper-case
characters.
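
Gross black-area discrimination can be illustrated as below. The source does not give the decision rule actually used in the SADIE-SEAC demonstration, so the nearest-reference-area rule, the reference values, and the letter names are all assumptions for the sake of the sketch.

```python
def black_area(image):
    """Count the black (1) cells in a binary image given as a list of rows."""
    return sum(sum(row) for row in image)


def identify_by_area(image, reference_areas):
    """Return the character whose stored black area is closest to the input's."""
    area = black_area(image)
    return min(reference_areas, key=lambda ch: abs(reference_areas[ch] - area))


# Hypothetical reference areas for three upper-case Cyrillic letters,
# keyed by transliterated letter names.
reference_areas = {"GE": 22, "A": 40, "SHA": 55}
unknown = [[1] * 6 for _ in range(4)]      # an input with 24 black cells
result = identify_by_area(unknown, reference_areas)
```

A rule of this kind can only work when every character in the vocabulary has a distinct black area, which is consistent with the small 14-character vocabulary mentioned above.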

As  each  character  was  recognized  it  was  assigned  an  appropriate  character-identification  code. 
When a blank (white) space indicating the end of a character string (or word) occurred, the
accumulated identified-character code could be used in conjunction with a computer program for any
of the following operations:

(1) The English-word transliteration of the Cyrillic-letter string,

(2) The 'definition' of the word, such that for the proper name of a Russian author, the
subject could be identified as: "Author", "Area X", 2/ "Human", etc.;

(3)  The  extension  of  the  word,  e.g.  ,  the  extension  of  the  received  Cyrillic  character 
sequence  translatable  as  the  English  word  'chairman',  in  the  form  of  a  list  of  the 
names  of  the  'Panel  Chairmen'  for  the  ICSI  Area  discussions,   and 

(4)  Given  that  the  recognized  word  was  the  proper  name  of  either  a  Russian  Panel 
Chairman  or  of  an  ICSI  author  who  is  Russian,   the  'location'  of  the  person 
referred  to  by  that  proper  name,   as  "Russia".    3/ 

A  special  variation  of  the  operations  to  select  the  target  pattern  language,   to  transform  the 
identification  decision  to  the  selected  target  pattern,   and  to  provide  output  is  given  in  the  case  where 
the  recognition  process  controls  other  operations.     This  is  the  case  for  counting  or  sorting  of  objects 
which have alphanumeric characters as indicia for action. An obvious example is in the sorting of
first class mail. Levy 4/ has been a pioneering exponent of the substitution of reading machines in
mail-sorting operations for the Canadian Post Office and himself holds a patent for an intermediate
step: a first human reading, inscribing of special address-destination marks, and machine reading of
these marks in order to control routing to bins or pockets of a sorting machine. 5/ The U.S. Post
Office Department has supported automatic reading technique developments at Farrington-IMR for
this purpose, and has recently placed new contracts both with Farrington and with the Research
Division of the Philco Corporation. The latter contract is for a reader to recognize 75 addresses at
anticipated rates of 1,500 characters per second. 6/ The target pattern output in these cases is the
routing  of  the  envelopes  on  which  these  addresses  appear  to  the  appropriate  sorting  bins.     The  German 
and  Dutch  Post  Offices  have  also  been  among  the  postal  service  organizations  that  have  demonstrated 
interest in these possible applications.

1/ That there are difficulties in the establishment and use of such conventions is evidenced by
variable transliterations of Russian proper names which occurred in the bibliographical
research related to this report. Thus we have found both 'Charkevitch' and 'Kharvevitch', and
'Fain' and 'Fayn' for certain Russian authors who have been cited. This problem of
transliteration has been aggravated by the very diversity of schemes that have been used. See, for
example, Razran, G. "Transliteration of Russian", Ref. 374.

2/ The ICSI program was divided into 7 specific subject matter 'Areas', and Discussion Panels
were selected for each such Area.

3/ For a discussion of the computer programs involved, see Stevens, M. E. "A machine model of
recall", Ref. 461.

4/ Levy, M. M. "The electronic aspects of the Canadian sorting of mail system", Ref. 270.

5/ Levy, M. M. U.S. Patent 2,925,586, Ref. 271.

6/ "All-electronic reader", Ref. 6, p. 14.

Thus, many different combinations of scanning techniques, recognition logic, and target pattern
output  have  been  proposed  for  carrying  out  the  process  steps  of  character  recognition.     These  various 
combinations  reflect  different  possibilities  for  meeting  the  requirements  of  a  particular  proposed 
application  of  automatic  reading  techniques.     The  operational  requirements  in  a  particular  situation 
are  therefore  of  considerable  importance  not  only  in  the  development  of  the  performance  specifications 
for  a  system  but  also  in  initial  appraisal  of  feasibility  and  in  system  design  if  new  equipment  is  to  be 
developed  to  meet  specific  user  needs. 

5.     OPERATIONAL  REQUIREMENTS  IN  AUTOMATIC  CHARACTER  RECOGNITION 

The  factors  that  are  critical  for  the  development  of  performance  specifications  for  any  given 
application  of  automatic  character  recognition  techniques  are  directly  related  to  the  objectives  that 
are  to  be  served  in  that  intended  application.    Similarly,    the  operational  requirements  for  a  character 
recognition  device  are  related  to  the  characteristics  of  the  situation  in  which  it  is  to  be  applied  and  to 
the  standards  of  reference  selected  for  performance  measurement.     The  evaluation  of  a  given 
character  recognition  system  is  therefore  directed  as  much  or  more  to  these  factors  as  to  the  logical 
and mechanical characteristics of the system or device itself. For example, in such massive
paperwork activities as the keypunching of individual wage earnings data from employer tax returns for
social  security  accounting,   a  steadily  increasing  work  load  may  outstrip  available  manpower.     There 
may  be  a  high  rate  of  turnover  among  personnel  who  perform  the  necessary  keyboard  operations,   and 
a  high  training  and  replacement  cost.     In  such  situations  the  basic  management  objective  in  looking 
toward  automatic  reading  devices    is  to  meet  present  manual  output  standards  for  an  increased  volume 
of  work.     That  objective  would  be  of  greater  significance  than  increasing  the  speed  or  the  accuracy  or 
decreasing  the  cost  of  the  transcription  operations,   although  obtaining  these  latter  advantages  would 
also  be  desirable. 

Another  major  management  objective  in  appraising  the  feasibility  of  automatic  character 
reading  techniques  might  well  be  that  of  achieving  truly  integrated  data  processing,   from  data 
origination  to  ultimate  use  of  processed  output.     Thus,   as  Hattery  has  noted,    "...   Automatic 
character  scanning  or  reading  ....    promises  to  give  a  new  dimension  to  the  revolution  which  is 
already taking place with the introduction of electronic computers." 1/ We have already remarked
on  the  appetite  of  automatic  data  processing  systems  for  large  quantities  of  input,    which  can  then  be 
processed  at  rates  far  out  of  balance  with  manual  methods  of  data  preparation  and  insertion.     An 
eminently  reasonable  objective,   therefore,   in  considering  the  possible  applicability  of  character 
recognition  devices  is  that  of  restoring  the  input-processing  balance  where  automatic  data 
processing  equipment  is  used.     Even  more  significantly,   the  objective  may  be  to  make  the  use  of 
such  equipment  profitable.   2/ 

Some  of  the  critical  factors  useful  in  evaluating  the  feasibility  of  using  automatic  character 
reading  techniques  are  discussed  below:    first,    in  terms  of  overall  requirements  based  upon 
objectives and characteristics of the processing situation; secondly, in terms of specific
requirements such as the characteristics of the initial recording medium and the font or type faces used;
and, thirdly, in terms of criteria for performance measurement. Special difficulties with
typewritten and foreign language materials will also be discussed.

5.  1    Overall  Requirements 

Overall  system  requirements  that  determine  what  factors  will  be  most  significant  in  evaluating 
automatic  character  recognition  techniques  for  a  particular  application  involve  the  various  stages  of 
data  origination,   transmission,    receipt,    input  to  the  reading-recognition  process,   output,    storage, 
and  subsequent  use.     In  any  specific  case,   the  pertinency  and  weight  to  be  assigned  to  the  various 
factors  must  be  determined  by  thorough  fact-finding  and  analysis  of  present  and  desired  conditions 
and  procedures.     In  general,    possibilities  for  maintaining  high  quality  input,    limiting  the  size  of  the 
vocabulary,   handling  carrier  items  efficiently,   and  meeting  realistic  reliability  requirements, 
should outweigh questions of position-dependent versus shape-dependent recognition logic,
permissible  reject  rates,   and  speed  or  rate  of  recognition.     In  particular,    we  should  note  that  the 
accuracy  of  the  recognition  system  output  cannot  be  expected  to  exceed  the  accuracy  of  the  data  or 
information  as  it  is  initially  generated  and  initially  recorded. 

It is indicative of the present state of the automatic character recognition art that all of the
reader  devices  that  have  actually  been  built  and  subjected  to  extensive  tests,    whether  in  the  field  or 
in  the  laboratory,   have  had  relatively  limited  vocabularies.     Vocabulary  sizes  that  have  been  tested 
range  from  10  to  16  symbols  (for  Arabic  numerals  and  a  few    special  characters)  up  to  about  100. 


1/ Hattery, L. H. "Automatic character reading...", Ref. 200, p. 159.

2/ Compare, for example, the following view: "Conversion from the human to the machine
has in many cases so increased the cost of the total system that the resulting processing
efficiencies have not been warranted on a dollar-and-cents basis." (Ref. 346, p. 28.)



Some  of  the  criterial  analysis  techniques  that  have  been  tested,    such  as  those  which  analyze  the 
number,   position  and  relative  size  of  character  strokes,   are  relatively  independent  of  minor 
variations  in  several  different  but  similar  type  styles.     Other  systems  permit  reprogramming  of 
identification formulas for a new type face, as by re-wiring of a plugboard control panel, or the
insertion  of  a  different  set  of  vocabulary  masks,   or  a  different  master  mask. 

Difficulties  are  apparent  with  all  of  the  techniques  so  far  proposed  in  equipment  under   actual 
development, in terms of idiosyncrasies of poor quality, especially typed, material. Problems in
horizontal registry may be overcome to some extent by utilizing the first of a series of scans not
to recognize the source pattern but to trigger the recognition process so that the input pattern is not
fed in until the leading edge of the source pattern has been detected. This principle is employed, for
example, in one of the Laboratory for Electronics readers. A positioning mask is used to similar effect
in both the Cook Typereader proposal and in a Rabinow reader. However, the use of the first scan
for registering purposes, or an alternative technique of 'rolling' through a shift register, is
impractical in many cases where there is extensive overlap and running together of characters.
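
The leading-edge principle can be sketched as follows. This is an illustrative model only: the representation of scans as lists of 0/1 cell values and both function names are assumptions, not a description of any of the readers named above.

```python
def leading_edge(columns):
    """Index of the first vertical scan that contains any black cell."""
    for i, column in enumerate(columns):
        if any(column):
            return i
    return None   # nothing but white space was scanned


def register(columns, width):
    """Return `width` scans starting at the leading edge of the pattern.

    Scans before the leading edge serve only to trigger recognition;
    they are not passed on as part of the input pattern.
    """
    start = leading_edge(columns)
    if start is None:
        return []
    return columns[start:start + width]


# Two blank scans, a three-scan-wide character, then one blank scan.
scans = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [1, 1, 1], [0, 1, 0], [0, 0, 0]]
pattern = register(scans, 3)
```

The sketch also shows why the method fails for touching characters: if there is no white scan between two characters, the second character has no detectable leading edge.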

Even  more  significant  than  the  limitation  of  presently  available  reader  systems  to  applications 
having  restricted  vocabularies,   however,   are  the  restrictions  on  the  size,   format,   and  variety  of 
carrier items so far successfully handled in actual devices. For example, discussions with Control
Instruments  personnel  during  our   1957  survey  brought  out  the  facts  that  in  their  investigations 
such  factors  are  frequently  more  critical  than  are  details  of  particular  recognition  logics.     They 
mentioned  such  problems  as  variable  opacity  of  the  scanning  slit,    scanning  photocell  (white)  noise, 
maintenance  of  constant  speed  during  carrier  item  feeding,   lack  of  exact  reproducibility  of  character 
image  even  from  the  same  printing  device,   positioning  during  scan,   tilt,   and  variations  in  lighting 
source  for  tilted  material.     It  is  also  noteworthy  that  present  paper  feed  limitations  restrict 
practical  recognition  rates  (i.e.  ,   of  several  thousand  characters  per  second  in  several  devices  that 
have  been  tested)  to  one-sixth  to  one-tenth  of  the  rates  theoretically  possible  with  present  scanning 
rates,   if  the  printed  page  must  be  used.     Prior  microfilming  of  the  input  material  may  result  in 
higher  effective  rates,   but  is  in  some  cases  not  feasible. 

Consideration  of  the  factors  involved  in  the  operations  of  data  origination  includes  determination 
of  the  answers  to  such  questions  as  the  number  of  different  sources  of  information.     This  question  is 
in  turn  affected  by  whether  or  not  such  sources  are  under  the  direct  or  indirect  administrative  control 
of  the  agency  which  subsequently  processes  that  information.     In  the  case  of  material  gathered  or 
intercepted  for  intelligence  purposes,   there  may  be  a  wide  variety  of  sources  over  which  there  is  no 
control.     On  the  other  hand,    in  the  case  of  traveler's  checks,   all  the  information  to  be  read  by 
machine  may  be  fully  controlled  by  preprinting  of  material  issued  only  by  the  organization  that  will 
later  process  the  checks  when  they  have  been  used. 

In  many  Government  applications,   the  turn-around  document  presents  special  problems  in  that 
the  degree  of  control  varies  widely  for  different  information  that  appears  on  one  and  the  same 
document.     On  U.S.   Savings  Bonds,   for  example,   there  is  preprinted  information  completely  under 
control,   but  there  is  also  information  as  to  the  date  of  issue  that  is     stamped  and  the  name  and 
address  of  the  purchaser  that  may  be  typed  or  even  handwritten  by  the  thousands  of  issuing  agents 
throughout  the  country.     Census  questionnaires,   tax  returns,   marketing  certificates,   and  other  forms 
provide  additional  examples  of  applications  in  which,   for  a  single  item,   there  is  a  mixed  degree  of 
prior  control  and  there  are  mixtures  of  inscriptions  of  information,   often  from  various  different 
sources.     Questions  on  the  extent  of  control  that  is  possible  in  the  origination  of  data  will  therefore 
also  involve  the  average  volume  of  messages  that  are  generated  per  source,   the  average  length  of 
such  messages,   the  different  types  of  messages,   uniformity  of  the  subject  matter  of  various 
messages, and the rates of message production per type per message source.

Factors  relating  to  the  methods  of  inscription  at  a  message  generating  source  typically  cover 
the  following: 

(1)  The  inscription  equipment  and  its  accuracy; 

(2)  Rates  at  which  inscription  proceeds; 

(3)  Characteristics  such  as  size,    shape,   thickness,   uniformity,    recording  density,   etc.  , 
of  the  carrier  medium  or  media,   and 

(4)  The  quality  of  the  original  inscription  in  such  terms  as,   for  example,   uniformity  of 
ink  density,   placement,   alignment,   and  registration. 

For  example,   Heasley  has  reported  typical  inscription  irregularities  in  several  of  the  Farrington- 
IMR  installations  as  follows: 

"...    The  encoder  is  found  to  degrade  the  symbol  shapes  by  variation  in  strength  of 
impression and in amount of inking, and by inking noise. On typical business equipment,
these factors may result in line-width variations of five to one, missing portions in
light impressions, blotted portions of heavy characters, random additional interference
on either light or dark characters, and a pronounced ribbon pattern ... The handling
process further degrades the symbol by superimposing interference." 1/

The  incidence  of  accidental  noise  superimposed  on  the  message  due  to  handling  should  thus  also 
be  considered. 

Questions  of  accuracy  pertinent  to  data  origination  operations  include  the  accuracy  of  the 
original  data.     Accuracy  of  initial  inscription  involves  both  human  and  mechanical  error  rates  (e.g.  , 
human  transposition  of  two  or  more  symbols,   on  the  one  hand,   and  mechanical  failure  such  as  failure 
of  a  keystroke  operation  to  record  the  proper  impression,   on  the  other). 

Closely  similar  considerations  are  involved  in  the  various  stages  of  data  transmission  and  data 
receipt. Error rates may be particularly significant, as for example in teletype transmission, where rates
may run as high as 30%. In addition, factors in data transmission and receipt include consideration of
the  number  of  transmission  channels  available,   and  the  traffic  capacity  per  channel.     Other  possible 
factors  are  the  tolerance  for  missing  or  erroneous  information,   the  readiness  with  which  the  receipt 
of  garbled  messages  can  be  detected,   the  possibilities  for  repeat  transmissions,   and  the  tolerance  for 
delays  in  receipt  or  for  delays  in  processing  after  receipt,   i.e.  ,   a  "stack  up"  tolerance. 

Factors  involved  in  the  operations  of  actual  input  to  a  character  recognition  device  involve  both 
system  requirements  and  specific  requirements  inherent  in  particular  problems.     The  former  include 
such  considerations  as  the  kind  and  extent  of  preparatory  operations  such  as  culling  of  illegible 
material.     In  particular,   possibilities  for  sorting  the  input  material  by  document  size,   by  quality  of 
carrier  medium  (e.g.  ,   bond  paper  as  against  newsprint  paper),   by  font  or  type-face  or  size  of  type, 
and  the  like,    should  be  carefully  considered.     Pre-editing  operations  such  as  handmarking  of  the 
beginning and end of passages to be read or the masking out of interspersed graphic material would
also  be  included.     Related  system  requirements  at  the  initial  input  stage  include  the  time  and  costs 
of  manpower  involved  in  handling  and  preparatory  operations,   the  percentage  of  material  culled 
because  of  illegibility  and  the  costs  of  obtaining  correct  re-transcriptions  of  such  material,   and  the 
requirements  and  costs  of  manually  processing  the  material  that  is  insufficiently  legible  to  be  read 
by  machine. 

Output,    storage,   and  subsequent  usage  considerations  with  respect  to  material  to  be  read 
similarly  involve  questions  of  the  quantity,   rate,   and  quality  of  output,    with  specific  attention  to 
acceptable reject (non-recognition) and error (misrecognition) rates. Also involved at the output
stage  are  questions  of  the  characteristics  of  the  output  language  or  languages.     Whether  the  output 
feeds  other  processing  equipment  as  an  on-line  process,   and  if  so  at  what  required  rates,   are  also 
questions  that  must  be  considered.     Format  and  other  characteristics  of  the  output  and  storage 
recording  media  are  pertinent.     Additional  storage  and  usage  factors  include  considerations  of 
desired  quality  of  output  such  as  legibility  for  human  inspection,    considerations  of  economic  storage 
density,   and  requirements  for   subsequent  retrieval  and  re-use  of  material  that  has  been  read. 
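
The distinction between reject and error rates can be made concrete with a short calculation. The record format below (pairs of true character and reader output, with `None` standing for a rejected character) is an assumption chosen for the example.

```python
def reject_and_error_rates(decisions):
    """Compute reject and error rates from (true_char, output) pairs.

    output is None when the reader declined to identify the character
    (a reject); any other output that differs from true_char is an error.
    """
    if not decisions:
        raise ValueError("no decisions to evaluate")
    total = len(decisions)
    rejects = sum(1 for _, out in decisions if out is None)
    errors = sum(1 for truth, out in decisions
                 if out is not None and out != truth)
    return {"reject_rate": rejects / total, "error_rate": errors / total}


# Hypothetical run of four characters: one reject, one misrecognition.
sample = [("A", "A"), ("B", None), ("C", "G"), ("D", "D")]
rates = reject_and_error_rates(sample)
```

The two rates are kept separate because they carry different costs: a rejected character can be routed to manual handling, whereas a misrecognized one passes into the output undetected.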

The  interaction  of  the  overall  objectives  and  the  specific  operating  conditions  in  a  particular 
proposed application forms the basis, first, for feasibility evaluation and, secondly, for
determination of performance specifications. Thus such factors may determine the minimum acceptable
reading rate necessary to keep up with a given flow of input information and a desired output rate,
whereas the specific requirements, to be discussed below, determine design specifications such as
the  tolerances  for  space  between  lines  or  the  rate  of  scan  necessary  to  achieve  recognition  of  a 
specified  vocabulary  at  the  desired  reading  rate. 

Patterns  of  critical  performance  factors  derived  from  analysis  of  overall  requirements  may 
be  exemplified  by  the  hypothetical  application  of  character  recognition  techniques  to  the  automatic 
dictionary  problem  as  follows: 

It  is  assumed  that  an  extensive  set  of  words  or  symbols  in  some  one  language  or  code  format, 
together  with  a  set  of  alternate  meanings  or  equivalent  code  representations  for  each  word,   in  the 
same  or  in  another  language,   are  to  be  established.     The  processing  problem  is  to  take  incoming 
words  in  a  source  language  and  to  match  them  to  the  corresponding  stored  words  in  the  dictionary. 
Upon  matching,    we  wish  to  receive  as  output  the  equivalent  word,    meaning,   or  code  representation 
in  the  target  language.   Automatic  character  recognition  techniques  might  be  applicable  in  initial 
establishment  of  the  reference  or  dictionary  file,   in  the  preparation  of  source  words  for  input  to  the 
mechanized  look-up  process  and,   as  in  the  case  where  the  dictionary  is  used  for  preparation  of 
encoded  messages,   in  automatic  proofreading  of  the  output  results. 


1/ Heasley, C. C., Jr. "Some communication aspects of character sensing systems", Ref. 205,
p. 176.

Initial establishment of the reference file would be a one-time task concerned with only one
or a few sources of data origination; e.g., an existing bilingual dictionary to be transcribed to
machine-usable form or the preparation of a master word list of selected terms in the source
language, together with one or more target language equivalents of these selected terms. In the
case of transcribing an existing printed or typed dictionary, control of the size and style of type can
be exercised only to the extent of selection from the available texts. However, limitation to a single
font or type style can be controlled by this selection. Questions of rate of generation of information,
volume of messages per originating source, etc., are irrelevant.

The  vocabulary  of  character  symbols  to  be  used  includes  at  least  the  number  of  characters  in 
the  alphabet  of  the  source  language  plus  the  number  of  different  character  symbols  that  occur  in  the 
target  language.     For  example,   for  a  Russian-English  dictionary  printed  entirely  in  upper  case, 
32  plus   15  characters  would  be  required,    11  characters  in  the  English  alphabet  having  identical 
character  forms  in  the  Russian.     For  the  same  dictionary  using  both  upper  and  lower  case,   the 
symbol vocabulary would be at least 64 plus 32 character forms since "veh" (B) and "en" (H) have
different lower case forms in the two alphabets. The accuracy of the original data should be high,
and in most cases it would be very important to maintain this original accuracy in the transcription
process. The rate at which reading-transcription operations proceed for a one-time application
would  presumably  not  be  a  significant  factor. 
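
The vocabulary counts in this example can be checked by a short calculation. The figure of nine shared lower-case forms is not stated in the text; it is inferred here from the remark that "veh" and "en" are the two letters whose lower-case forms differ, and it is the value that reproduces the stated total.

```python
RUSSIAN_LETTERS = 32    # per case, as counted in the text
ENGLISH_LETTERS = 26
SHARED_UPPER = 11       # English capitals identical in form to Russian ones
# Inferred: "veh" (B) and "en" (H) match only in upper case.
SHARED_LOWER = SHARED_UPPER - 2

extra_upper = ENGLISH_LETTERS - SHARED_UPPER     # English-only capitals
extra_lower = ENGLISH_LETTERS - SHARED_LOWER     # English-only small letters

upper_case_only = RUSSIAN_LETTERS + extra_upper                  # 32 plus 15
both_cases = 2 * RUSSIAN_LETTERS + extra_upper + extra_lower     # 64 plus 32
```

Under these assumptions the upper-case dictionary needs 47 character forms and the two-case dictionary needs 96, before any allowance for numerals, punctuation, or multiple type sizes.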

The  choice  between  manual  and  automatic  methods  for  a  one-time  reading  transcription  process 
would  thus  largely  depend  upon  the  following: 

(1) The relative economics of manual-keyboard-transcription operations as against
investment in a reader device with a vocabulary that would be larger (e.g., the 96
upper and lower case characters for the second Russian-English example above in
at least several different sizes) 1/ than any available in devices as yet in production
operation,

(2)  the  possibilities  for  subsequent  utilization  of  either  the  reader  equipment  or  the  keyboard 
operators  after  the  one-time  task  has  been  completed, 

(3)  the  relative  accuracy  of  the  final  transcriptions  where  in  the  one  case  the  automatic 
rejection  of  marginal  recognitions  might  be  quite  closely  controlled,   whereas  in  the 
other  case  human  error  rates  would  presumably  increase  in  proportion  to  lack  of 
familiarity  with  the  foreign  language  characters,   and 

(4)  the  relative  availability  of  keyboard  operator  personnel  or  of  equipment  capable  of 
recognizing  the  total  symbol  vocabulary  with  a  degree  of  accuracy  at  least  equivalent 
to  that  obtainable  by  keyboard  plus  verification  operations. 

If, however, the initial reference file were to be based upon the selection and assembly of the
word list from a variety of sources, some form of manual transcription would be required in
preparing the master list in almost all cases. The relative cost factor would therefore need also to
include the comparative costs of keyboard operation where data are simultaneously produced in
machine-usable form as on punched cards, punched paper tape, or magnetic tape.

Subsequent  use  of  automatic  character  recognition  techniques  for  input  of  material  to  be 
looked  up  in  the  dictionary  or  proofreading  of  output  would  improve  the  possibilities  for  economical 
use  and  amortization  of  capital  investment  in  the  automatic  reading  equipment.     On  the  other  hand, 
subsequent  usage    planning  would  require  additional  consideration  of  rate,  volume,   and  time  factors 
to keep up with the rate of receipt of such input material. As an example of probable minimum input
rates, some organizations engaged in manual translation have had an input volume of approximately
20,000 Russian journal pages each year. In the case of input preparation, use of automatic techniques
might further aggravate the problem of size of vocabulary available, since the input materials
might originate from a variety of sources in a variety of formats and type styles.

The  critical  questions  that  are  raised  from  the  interplay  of  overall  requirements  with  overall 
system  performance  specifications,   whether  in  the  above  simple  situation  or  in  situations  where 
hundreds  of  manual  operators  deal  daily  with  tens  of  thousands  of  items  from  varied  sources  and  in 
varied  forms,    can  be  reduced  to  the  following  major  points: 

(1)  What  is  the  economic  break-even  point  for  the  use  of  an  automatic  character  reader, 
of  given  performance  capabilities,  for  all  or  for  a  selected  part  of  the  work  load,  in 
the  light  of  the  basic  system  objectives? 


1/ We note, further, that in many type styles, differences in size are accompanied by
changes in proportionality of character strokes and in differences other than those
of one-for-one reduction.

(2) What are the possibilities, for one or more different reading-transcription techniques
(human  or  machine,   or  both),   for  meeting  the  operational  requirements  of  flexibility 
with  respect  to  either  present  or  future  demands? 

(3)  What  can  be  done,    in  the  total  system,   in  the  light  of  basic  objectives,   towards 
standardization  of  forms,   formatting,    standardized  or  specialized  fonts,   administrative 
control  of  quality  of  input,   to  facilitate  the  use  of  available  character  readers  at  a 
practical  and  economically-profitable  level? 

To  take  up  these  points  in  reverse  order,   we  note  first  that  where  management  decides  on  a 
self-checking, specially-designed font, as in charge-a-plate usage in the oil industry, an apparently
efficient  solution  has  already  been  found.     The  second  major  question  relates,   in  the  first  place,   to  the 
need  for  precise  operational  analysis  with  respect  to  real  requirements  in  a  particular  situation.     It 
relates,    in  the  second  place,   to  the  possibility  of  accepting  machine-recognition-processing  for  a 
designated  part  of  the  total  workload.     It  is  thus  itself,   as  a  major  question,    related  to  the  first  major 
point, the question of economic break-even point for some designated part (all or a selected percentage)
of the total workload. However, it also relates to the question of using several different
techniques  (different  readers,   at  different  costs  and  with  different  performance  capabilities)  in 
combination  (including  the  combination  of  people  and  machines)  to  achieve  the  total  system  goal. 

It is the first question, that of the economic break-even point, which is of crucial importance with
respect  to  the  presently-practical  use  of  presently-available  automatic  reading  techniques.     If,   after 
careful  consideration  of  objectives  and  after  detailed  appraisal  of  operational  requirements,   it  is 
possible  to  isolate  a  segment  of  workload,   amenable  to  strict  control  over  original  data  preparation 
and  perhaps  susceptible  to  use  of  a  specialized,   limited-vocabulary  font,   then  character  readers  of  a 
type already in use may provide both considerable cost reductions and considerable gains in speed
of  overall  processing. 

With  due  regard,   however,    to  the  previously  mentioned  factors  of  (1)  maintenance  of  high 
quality  input,    (2)  vocabulary,    (3)  carrier  handling,    (4)  reliability  requirements,   and  (5)  flexibility  in 
adjustments  for  changing  requirements,   the  overall  question  of  feasibility  is  generally  one  that 
depends  upon  balancing  the  comparative  speed  of  machine  as  against  human  target  pattern  output,   at 
an  acceptable  reject  rate,   with  an  appropriate  safety  factor,   in  terms  of  comparative  costs.     We 
shall  consider  this  point  in  more  detail  in  terms  of  criteria  for  performance  measurement,   after 
considering  examples  of  more  specific  operational  requirements  and  of  difficulties  with  special 
classes  of  materials,   below. 
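
The balancing of machine against human costs described above can be illustrated by a minimal sketch. It assumes a simple linear cost model that is not given in the report: a fixed equipment cost, a per-item machine handling cost, and manual keying of every item the machine rejects. All function and parameter names are hypothetical.

```python
def machine_cost_per_item(handling_cost, reject_rate, manual_cost_per_item):
    """Per-item machine cost, assuming every rejected item is keyed manually."""
    return handling_cost + reject_rate * manual_cost_per_item


def break_even_volume(machine_fixed_cost, machine_item_cost, manual_cost_per_item):
    """Number of items at which total machine cost equals total manual cost.

    Returns None when the machine's per-item cost is not lower than the
    manual per-item cost, since no volume then recovers the fixed cost.
    """
    saving_per_item = manual_cost_per_item - machine_item_cost
    if saving_per_item <= 0:
        return None
    return machine_fixed_cost / saving_per_item
```

With purely illustrative figures of 1000 cost units of fixed cost, 0.25 per item by machine, and 0.75 per item by keyboard, `break_even_volume(1000.0, 0.25, 0.75)` gives 2000 items; a higher reject rate raises the machine's per-item cost and so moves the break-even point upward.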

5.2 Specific Requirements

In  addition  to  overall  requirements  as  related  to  automatic  reading  techniques,   there  are  a 
number  of  specific  requirements.     These  set  the  appropriate  factors  for  comparative  evaluation 
which  arise  from  the  difficulties  inherent  in  the  scanning-recognition  process.     The  carrier  medium, 
such  as  the  paper  document  on  which  the  data  to  be  read  is  printed  or  typed,    must  first  of  all  be 
conveyed  to  a  scanning  station.     The  rate  of  feeding  the  carrier  items  must  be  at  least  commensurate 
with  the  rate  at  which  the  average  number  of  characters  that  can  be  recorded  on  such  items  can  be 
recognized.     Otherwise,   the  reading  rate  that  can  be  realized  will  be  limited  to  the  carrier-feeding 
rate.     Feeding  can  be  accomplished  either  by  manual  insertion,    which  is  comparatively  slow,   or  by 
automatic  paper  handling  equipment  which  may  be  both  expensive  and  difficult  to  maintain  in  proper 
operating  order.     Quite  complicated  machinery  may  be  required  to  maintain  regularity  in  paper 
feeding unless similar automatic feeding techniques have been used in the initial printing, 1/ as for
example,   in  the  case  of  the  sprocket  holes  for  continuous  form-feed  teletype  or  tabulator  output. 
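
The dependence of the realized reading rate on the carrier-feeding rate, noted above, can be sketched as follows. The function and parameter names are hypothetical, and the model assumes that feeding and recognition proceed concurrently so that the slower of the two sets the pace.

```python
def effective_reading_rate(recognition_rate_cps, feed_rate_items_per_sec, avg_chars_per_item):
    """Realized reading rate in characters per second.

    The recognition logic cannot be supplied with characters faster than
    the carrier feed delivers them, so the realized rate is the smaller
    of the recognition rate and the feed-limited rate.
    """
    feed_limited_rate = feed_rate_items_per_sec * avg_chars_per_item
    return min(recognition_rate_cps, feed_limited_rate)
```

For example, a recognition unit nominally capable of 500 characters per second, fed two items per second averaging 100 characters each, would realize only 200 characters per second.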

Carrier  feed  requirements  will  typically  involve  questions  of  handling  documents  of  different 
dimensions  and  shapes.     Adequacy  of  means  to  assure  one-at-a-time  pickups  of  carrier  items  from 
a  feed  hopper,   and  adequacy  of  means  to  prevent  the  interference  of  one  carrier  item  with  another 
as  items  are  conveyed  to  the  scanning  station,   are  to  be  considered  as  factors  affecting  the  paper- 
feed  rate.     In  some  cases,   as  in  taking  of  physical  inventory  where  labels  or  tags  are  to  be  read  at 
the  site  of  storage  or  as  in  reading  aids  for  the  blind,    the  scanner-reader  device  is  brought  to  the 
material,   imposing  the  requirement  that  it  be  portable.     In  situations  where  the  input  material  that 
is  to  be  read  is  also  to  be  permanently  retained,   a  preliminary  process  of  microfilming  may  result 
in  simplification  of  some  of  the  carrier  feed  problems,   but  it  may  also  introduce  other  factors  of 
cost,   process  control,   and  the  like. 

Carrier-handling problems will also include the positioning of the carrier item at the scanning
stations  so  that,   for  example,    it  is  not  tilted  and  so  that  all  information  on  the  item  can  be  read. 


1/ See Cook, H. D. "A study of print reading systems", Ref. 80.

Assuming the carrier item itself (such as the paper check, punched card, or page of printed or typed
material)  to  be  accurately  positioned,   additional  positioning,    either  by  movement  of  the  carrier  or  by 
movement  of  the  scanning  mechanism  or  both,    is  required  to  locate  each  unit  of  information  that  is  to 
be sensed in its entirety in a scan cycle. These micropositioning requirements are related both to the
equipment  characteristics  in  positioning  and  in  scanning  and  to  the  registration  of  the  information  on 
the  carrier  medium.     It  has  been    estimated,   for  example,   that  tolerances  from  exact  register  in 
order  for  ordinary  printed  material  to  be  distinguishable  by  proposed  reading  machines  cannot  be 
greater than one-thirtieth of the overall width of the widest character. 1/
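
The one-thirtieth tolerance cited above lends itself to a simple check. The following is a sketch only; the function names are hypothetical, and the units (any consistent unit of length) are an assumption.

```python
def max_register_error(widest_char_width):
    """Largest allowable departure from exact register: 1/30 of the widest character."""
    return widest_char_width / 30.0


def within_register(offset, widest_char_width):
    """True if a measured positioning offset falls inside the tolerance."""
    return abs(offset) <= max_register_error(widest_char_width)
```

Thus, for a widest character 3.0 units across, the allowable error is 0.1 unit in either direction.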

Micropositioning  can  be  greatly  facilitated  by  the  use  of  relatively  gross  guide  marks  that  can 
either  be  preprinted  (where  the  originating  sources  are  under  administrative  control)  or  be  added  by 
a  pre-editing  operation.     It  is  significant  to  note  that  such  positioning  controls,   having  the  effect  of 
a  detent  mechanism  that  locks  the  scanning  precisely  to  the  area  to  be  scanned,   will  also  enable 
various  scanning  systems,   particularly  flying  spot  scanners,   to  be  used  at  maximum  acuity.     Thus, 
arbitrarily  improved  resolution  can  be  obtained.     Without  such  pre-positioning  controls,    micro- 
positioning  factors  to  be  determined  will  include  the  placement  of  the  body  of  the  information  with 
relation  to  the  edges  of  the  carrier  medium,   and  the  amount  and  consistency  of  leading  or  space 
between  lines. 

Factors  involving  both  system  requirements  and  micropositioning  in  the  sense  of  carrier 
handling  and  reader  scanning  relate  to  formatting  considerations  on  the  carrier,    such  as  the  use  of 
designated  fields  for  certain  types  of  information.     For  example,   on  an  inventory  control  document, 
alphanumeric  information  may  appear  in  a  field  reserved  for  stock  identification,   but  only  numeric 
data  should  appear  in  fields  used  for  quantity  issued,   unit  cost,   and  similar  data.     Several  fields  may 
occur  on  the  same  line,   with  only  one  to  be  read.     In  many  other  potential  applications,    information 
actually  appearing  may  be  required  to  be  omitted  from  the  processing,   or  unreadable  information, 
such  as  graphic  material,    may  be  interspersed  with  text.     In  a  prototype  page  reader  for  Russian 
text  under  development  at  Baird-Atomic,    fiduciary  marks  consisting  of  black  strips  of  variable  width 
are used to signal 'begin-read', 'stop-read' and 'begin-read-again-in-the-same-line'. These marks
are  superimposed  on  the  original  page  copy  as  it  is  microfilmed,   the  reader  itself  being  designed  to 
scan  the  pages  as  they  are  reproduced  on  70  mm  film. 

Micropositioning  factors  relative  to  any  single  line  involve  questions  of  the  vertical  and 
horizontal  alignment  of  individual  characters  in  that  line.     With  regard  to  any  single  character, 
critical  positioning  factors  include  the  limits  of  rectilinear  translation  of  the  actual  character  within 
the  normal  character  space  and  the  limits  of  angular  displacement  or  skew  of  the  image  in  that 
character  space.     Improper  or  irregular  spacing,   for  example,   within  a  word,   is  found  in  some 
typewritten  material  and  quite  commonly  in  cursive  handwriting.     The  question  of  overlap  between 
characters raises serious problems in scanning-recognition systems that are based on the
utilization of a blank space between characters to trigger a character-scan cycle. Usually, however,
overlapping and bleeding between characters occur only where a fixed character space is used for
all characters in the line. In the case of printing, or in those few typewriting devices where
proportional  spacing  is  used  for  characters  of  different  width,   overlapping  does  not  normally  occur. 
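
The reliance of some systems on a blank space between characters, and the way in which overlap defeats it, can be illustrated by a minimal sketch. It assumes a line image already reduced to black and white cells, represented here as strings of '#' and '.'; this representation and the function name are hypothetical and do not describe any particular reader.

```python
def segment_by_blank_columns(rows):
    """Split a line image into character spans at wholly blank columns.

    rows: list of equal-length strings, '#' for black and '.' for white.
    Returns a list of (start, end) column spans, end exclusive.
    """
    if not rows:
        return []
    width = len(rows[0])
    # A column counts as inked if any row has a black cell in it.
    inked = [any(row[c] == "#" for row in rows) for c in range(width)]
    spans = []
    start = None
    for c, has_ink in enumerate(inked):
        if has_ink and start is None:
            start = c                      # a character span begins
        elif not has_ink and start is not None:
            spans.append((start, c))       # a blank column ends the span
            start = None
    if start is not None:
        spans.append((start, width))       # span runs to the right edge
    return spans
```

Two characters separated by one blank column yield two spans. If a single black cell bridges the gap, as with bleeding or overlap, the same procedure returns one span covering both characters, and the character-scan cycle is never triggered between them.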

Once  the  carrier  medium,   line  of  information,   and  individual  character  image-unit  to  be 
scanned  have  been  positioned,   the  actual  scanning  requirements  again  involve  both  equipment 
characteristics  such  as  scan  rate,    resolution  of  scan,   and  illumination  of  the  area  to  be  scanned  on 
the  one  hand,   and  characteristics  of  the  information  and  its  carrier  on  the  other.     The  resolution 
obtainable  in  the  scanning  system  will  in  turn  relate  to  questions  of  the  total  amount  of  information 
that  can  be  used  as  a  scanning  unit  at  one  time  and  the  degree  of  magnification  of  the  image  field 
to  be  scanned  that  may  be  achieved  through  suitable  optics. 

Carrier  characteristics  affecting  the  performance  requirements  for  scanning  involve  the 
variety  of  types  of  carrier  media.     The  physical  qualities  of  the  carrier  medium  as  background  are 
particularly  important.     These  will  include  the  color,    opacity,    roughness  of  surface,   and 
reflectivity  of  the  carrier  material,   especially  paper.     For  example,   materials  such  as  wood  chips 
used  in  the  making  of  paper  may  cause  specks  and  shadows.     In  the  case  of  Russian  printed 
materials,   the  use  of  poor  quality  paper  may  be  aggravated  by  poor  quality  printing  so  that  crossbars 
of  certain  characters  may  be  missing  and  so  that  matrix  'hair  line'  tracings  are  included  in  the 
character space. 2/ For typewritten material, roughness of surface causes uneven distribution of


1/ Cook, H. D. "A study of print reading systems," Ref. 80.

2/ See Boni, C. "Russian type study," Ref. 55. The report includes tables of observed
character defects by type of defect. However, the matrix 'hairlines', the "extraneous
vertical lines between characters resulting from the accumulation of unwanted type
metal on the sides of the matrices," were apparently too numerous to count.

ink from the ribbon, and hence imperfect edges of typed characters. In some studies 1/, definite
variations  in  paper  reflectivity  from  point  to  point  have  been  found,    such  that  design  requirements 
might  well  include  means  for  accurate  control  of  the  distances  between  light  sources  illuminating  the 
scanned  area,   the  paper,   and  the  means  for  signal  pickup.     In  all  cases,    consideration  of  carrier 
background  characteristics  should  include  evaluation  of  the  probabilities  of  superimposed  noise  from 
accidental  or  deliberate  over-marking. 

One  of  the  most  important  factors  in  establishing  performance  criteria  for  the  scanning  process 
is  the  contrast  between  the  inscription  and  the  background  provided  by  the  carrier  on  which  the 
information  to  be  read  is  inscribed.     This  contrast  is  typically  the  product  of  the  interaction  of 
carrier qualities and the physical characteristics of the actual inscription. The signal-to-noise ratio
is determined by the degree of contrast between the inscription and its background (contrast between
black ink and white paper) together with both random and superimposed noise (e.g., from over-marking) in the
image field that is scanned. In printing processes, the "weight" (the amount of ink the type puts on
paper) and the "color" (the density of ink deposited per page) of a given type face directly influence
the expected contrast factor for that type face. 2/ In typewritten material, the qualities of the typewriter
ribbon compound the variations available from the paper, the conditions of type-surface
cleanliness, and operator touch. For example, the texture of the ribbon may result in a superimposed
pattern of the textile threads both upon characters and in the space between characters. 3/

Closely  related  to  questions  of  degree  of  contrast  and  of  variability  in  contrast  are  factors  of 
the  quality  of  the  line  edges  of  the  character  image,   including  the  phenomena  of  bleeding  and  filling  in 
of  characters  which  aggravate  the  problems  of  background  noise,   and  of  the  extent  to  which  portions 
of  the  character  are  likely  to  be  missing  or  incomplete.     In  the  previously  cited  study  of  Russian 
printing,    Boni  reports  that  only  5  of  341  pages  sampled  from  various  books  were  free  of  defective 
character forms. Some of the conclusions reached in this study were as follows:

"...    printed  Russian  pages  are  very  uneven  in  color  of  image  and  weight  of  impression; 
type images are often blurred and broken; sometimes there is complete breakaway; the type
does not make contact with the paper and there is no image at all . . . The alignment of
characters sometimes is poor; matrix sidewalls appear to break down from time to time,
producing deformations in character stems and bowls; and many matrix 'hairlines'
appear between characters." 4/

A  final  major  consideration,    in  evaluating  the  probable  effectiveness  of  the  scanning  process  in 
an  automatic  reader  system,   involves  the  extent  of  information  content  of  the  character  image  field 
that  is  scanned  in  a  single  complete  scan  cycle,   and  the  degree  of  quantization,  or  unitization  of  this 
field  that  is  developed  in  the  scanning  process  to  produce  the  input  pattern.     The  information  content 
of  the  scanned  field  may  vary  from  that  located  anywhere  in  the  whole  area  of  the  carrier  item  (e.g.  , 
scan  of  an  entire  envelope  to  determine  where  the  stamp  is  located  in  automatic  facing  and  cancelling 
systems)  through  a  single  line  or  a  single  message  sub-unit,    such  as  a  word,    down  to  the  individual 
source  pattern  character  symbol  or  to  sub-sets  of  the  source  pattern  image,   and  even  to  points  or 
cells  in  the  image  field  which  may  be  considerably  smaller  than  character  elements  such  as  strokes. 
The  character  image  field,    whatever  the  extent  of  its  information  content,    may  either  be  traversed 
continuously  or  it  may  be  selectively  sampled  at  particular  points.     Factors  involving  the  extent  to 
which  a  source  pattern  can  be  broken  into  a  number  of  sub-patterns  of  elements  in  the  scanning 
process  are  closely  related  to  the  particular  recognition  logic  employed.     Similar  factors  involving 
the  extent  to  which  the  results  of  any  one  scan  in  the  cycle  are  stored  for  comparison  with  subsequent 
scans  also  interact  with  the  recognition  logic. 

As  a  character  image  field  is  being  scanned,    or  after  the  results  of  scan  have  been  sensed,   a 
variety  of  processes  can  be  used  to  improve  the  derived  image.     These  include,    as  we  have 
previously  noted,   various  means  for  contrast  enhancement  such  as  comparing  the  integrated  blackness 
of  a  scan  area  with  a  prescribed  threshold  value,   and  readjusting  such  thresholds  in  accordance  with 
integrated values derived from preceding scans in the same cycle. 5/ Similarly, results of early scans
in  a  cycle  may  be  used  to  re-position  the  scanner  mechanism  for  more  exact  register.     For  example, 
the  Solartron  ERA  optical  reading  system  has  the  following  source-to-input  pattern  processing 
features: 


1/ See Carlson, C. O. "Preliminary investigations on direct symbol recognition," Ref. 68.

2/ See Karch, R. R. "How to recognize type faces," Ref. 245.

3/ Cook, H. D. "A study of print reading systems," Ref. 80.

4/ Boni, C. "Russian type study," Ref. 55, pp. 2-16, 3-4, 3-5.

5/ See, for example, Greanias, E. C., Hoppel, et al. "Design of logic for recognition of printed
characters by simulation," Ref. 175.

"Each character is submitted in general to three scans of 'frames' with vertical lines.
The  first  of  these  records  the  density  of  the  character,   and  operates  a  clamped  control  to 
ensure  that  the  two  following  frames  have  favorable  black/white  contrast,   differentiating 
the  true  character  from  the  smudges  or  'halo'  which  surround  it.     The  character  is  then 
provided  with  a  fairly  clean  black  edge  for  the  second  frame,    which  covers  the  same  area 
and  measures  the  extreme  black  edge  at  either  side,   top  and  bottom  of  the  character.     A 
clamped  control  derived  from  this  is  applied  to  centralize  the  character,   which  it  does  to 
within ± 1/2 element vertically and horizontally during the third frame. The third frame,
with nearly correct contrast and registration, is used for reading." 1/
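
The three-frame sequence quoted above can be paraphrased in a short sketch. This is an illustration under stated assumptions and not a description of the Solartron circuitry: the character field is assumed to be a small grid of density values from 0.0 (white) to 1.0 (black), the contrast control is modelled as a threshold set at a fixed fraction of the darkest value found, and all names are hypothetical.

```python
def frame1_threshold(gray, fraction=0.5):
    """First frame: record the density and derive a black/white threshold.

    Assumption: the threshold is a fixed fraction of the darkest cell,
    so that a faint 'halo' around the character falls below it.
    """
    darkest = max(max(row) for row in gray)
    return darkest * fraction


def frame2_bounds(gray, threshold):
    """Second frame: find the extreme black edges (top, bottom, left, right)."""
    rows = [r for r, row in enumerate(gray) if any(v >= threshold for v in row)]
    cols = [c for c in range(len(gray[0])) if any(row[c] >= threshold for row in gray)]
    if not rows:
        return None
    return min(rows), max(rows), min(cols), max(cols)


def frame3_offset(bounds, height, width):
    """Shift (rows, columns) that would centre the character for the reading frame."""
    top, bottom, left, right = bounds
    dy = (height - 1) / 2 - (top + bottom) / 2
    dx = (width - 1) / 2 - (left + right) / 2
    return dy, dx
```

In this sketch the halo is excluded simply because its density falls below the threshold derived in the first frame, and the offsets returned for the third frame are the corrections that a centring control would apply.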

Performance  requirements  in  the  area  of  input  pattern  improvement  will  therefore  relate  to  the 
variability  of  contrast  conditions  in  input  material,    expected  variations  in  source  pattern  sharpness 
and  completeness,    expected  incidence  of  source  pattern  mis-register,   and  the  like. 

In evaluation of the effectiveness of the recognition processes, a major operational consideration
is the total size of the character vocabulary that can be accommodated. Vocabulary size
considerations will include questions of whether varying sets of characters (i.e., upper and lower
case,   character  sets  of  similar  but  different  type  style,    character  sets  of  different  sizes  within  the 
same  series  of  type,    character  sets  of  different  weights  within  the  same  family  of  type,   etc.)  can  be 
recognized.     The  number  of  fonts  to  be  encountered  in  input  material  and  the  type  style  of  particular 
fonts  determine  the  similarity  of  the  characters  that  must  be  distinguished  from  each  other  and  thus 
set  requirements  as  to  vocabulary  size,   precision  of  recognition,   or  both.     Allowable  tolerances  for 
nonrecognition  of  ambiguous  characters  or  for  misrecognition  will  obviously  influence  the  required 
complexity  of  the  recognition  logic. 

Where  several  different  fonts  are  to  be  recognized,   it  may  be  necessary  that  the  recognition 
process include determination of the set, among several, to which a given input character belongs, so as to adjust
the logic appropriately. Figure 15 (p. 43) illustrates some of the terms that are used in identifying
different  type  style  characteristics.     Human  operators  can  distinguish  one  given  type  style  from 
another  by  such  features  as  the  shape  of  the  serifs,   width  of  characters,   weight,    contrast  of  heavy 
and  light  strokes,    design  of  cross  strokes,   length  of  descenders,   length  of  ascenders,   and  design  of 
the lower case "g". 2/ Machine ability to discriminate between such features may therefore be
required in a universal-type character reader. Differences in size in the same or different type
styles  that  must  be  accommodated  will  influence  machine  characteristics  in  the  required  resolution 
of  scan,   in  adequacy  of  means  to  standardize  characters  to  a  definite  size  during  image  improvement, 
or  in  the  required  versatility  of  the  recognition  logic.     Where  it  is  required  that  a  reader  system  be 
capable  of  handling  dissimilar  fonts  with  a  large  total  vocabulary,   evaluation  will  necessarily  include 
consideration  of  the  ease  with  which  the  reader  can  be  adjusted  to  handle   each  different  font,   e.g.  ,   by 
insertion  of  a  new  plugboard  or  a  new  set  of  vocabulary  masks,   or  by  self-adjusting  processes. 

Allowable  tolerances  for  errors  and  rejects  will  determine  the  extent  of  the  capabilities  for 
re-run  and  re-setting  of  both  scanning  and  recognition  thresholds  that  will  be  required.     Finally,   the 
specific  requirements  for  output  operations  will  include  the  means  for  subsequent  handling  of  carrier 
items  if  they  are  to  be  sorted  or  routed  as  a  product  of  the  recognition  operation,   the  ease  with 
which  varied  output  formats  may  be  obtained,   provisions  for  editing  such  as  spacing  and  tabular 
alignment  of  output,   machine  output  characteristics  such  as  punching  or  printing  rate,   and,   where 
required,   the  provision  of  means  for  automatic  checking  of  output  accuracy.     Such  output 
considerations  will  in  turn  influence  other  factors.     For  example,    if  the  output  requirement  is  to 
punch  cards,   a  punching  rate  of  100  cards  per  minute  would  clearly  indicate  that  the  character 
recognition  rate  need  not  exceed  8,000  characters  per  minute,   or  approximately  130  characters  per 
second. 
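
The arithmetic behind this example can be set out explicitly. The sketch below assumes the standard 80-column punched card with one character per column, a figure that the text does not state but that is consistent with its total of 8,000 characters per minute; the names are hypothetical.

```python
CARD_COLUMNS = 80  # assumption: standard 80-column card, one character per column


def max_useful_recognition_rate(cards_per_minute, columns_per_card=CARD_COLUMNS):
    """Recognition rate beyond which the card punch becomes the bottleneck.

    Returns (characters per minute, characters per second).
    """
    chars_per_minute = cards_per_minute * columns_per_card
    return chars_per_minute, chars_per_minute / 60.0
```

For 100 cards per minute this gives 8,000 characters per minute, or about 133 characters per second, which the text rounds to approximately 130.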

5.3 Special Difficulties with Typewritten Material

We  have  already  mentioned  some  of  the  special  difficulties  that  are  encountered  in  automatic 
reading  of  typewritten  material,    such  as  the  variables  introduced  by  characteristics  of  the  ribbon  and 
by  differences  in  operator  touch.     However,    in  many  potential  applications  of  automatic  reading 
techniques, a significantly large portion of material to be transcribed to machine-usable form
consists  of  information  typed  on  ordinary  typewriters  at  a  variety  of  originating  sources.     For  this 
reason,    some  of  the  critical  factors  likely  to  be  involved  in  successful  reading  of  typed  material  will 
be  discussed  in  greater  detail. 

To  explore  some  of  the  difficulties  inherent  in  the  recognition  of  typewritten  material,    sample 
upper  and  lower  case  alphabets  were  typed  on  a  variety  of  typewriters  that  were  in  use  on  the  same 


1/ Bailey, C. E. G. "Introductory lecture on character recognition," Ref. 31, p. 445.

2/ Karch, R. R. "How to recognize type faces," Ref. 245.


day  in  several  different  offices.     Ten  different  models  of  typewriters  of  various  makes  were  sampled, 
and in two cases samples were obtained from two different typewriters of the same make and model.
This  method  of  sampling  was  deliberately  chosen  to  obtain  results  representative  of  the  output  of  a 
Government  office,   on  a  typical  day,   without  prior  warning  or  opportunity  to  clean  the  machines  or  to 
change  ribbons.     Some  of  the  results  are  shown  in  Figures   17  through  21.     Figure   17  shows  the  name 
of  the  manufacturer  and  model  as  typed  on  each  respective  machine,   for  the  12  typewriters  from 
which  samples  were  taken.     Figure  17  also  illustrates  the  effect  of  proportional  spacing  and  use  of 
carbon  paper  ribbon  in  the  IBM  Executive  and  the  effect  of  a  typist  strike-over,   in  the  upper 
Remington  Standard. 

In  all  except  two  cases,   the  IBM  Executive  and  the  upper  Remington  Electric  (sans  serif  type 
face),   there  was  a  noticeable  degree  of  overlap  between  characters,   or  of  bleeding  from  one 
character  to  another.     Note,   for  example,    "RM"  in  Hermes,    "IB"  in  IBM  Electromatic,    "NT"  in  L.C. 
Smith  Model  Seventeen,    "RE"  in  Remington  Standard,   and  "AN"  in  Royal  Standard,   as  well  as  the 
examples  shown  in  Figures   18  (b),    19,    20  (a),    20  (b),   and  21.     Figures   18  (a)  and  18  (b)  illustrate 
differences  between  two  machines  of  the  same  make  and  model  (Royal  Standard).     This  is  to  be 
expected, of course, since the idiosyncrasies of individual machines have been used to identify the
machine  on  which  a  particular  document  has  been  typed.     These  variations,   however,    may  also 
cause  serious  difficulties  in  automatic  recognition.     Note  that  in  Figure  18  (b)  the  letters  "A"  and 
"B"  overlap,   whereas  in  Figure   18  (a)  they  do  not.     The  apex  of  the  "A"  in  Figure   18  (b)  has  a 
noticeable  hole,    such  that  this  character  has  two  enclosed  white  areas  while  the  "A"  in  Figure   18  (a) 
has  only  one.     There  is  a  small  but  complete  break  across  the  vertical  stroke  of  the  "B"  in 
Figure   18  (a)  such  that  a  scan  centered  on  this  stroke  might  count  one  long  continuous  black  blob  for 
"B"  in  Figure  18  (b)  but  two  blobs  in  the  "B"  of  18  (a). 

Figure   19  again  shows  overlap  or  bleeding  in  upper  case  letters  from  one  of  the  Remington 
Standard  samples,   together  with  broken  strokes  and  missing  portions  of  letters.     In  Figure  20  (a) 
and  (b),   we  show  two  lines  typed  with  the  same  IBM  Electromatic  typewriter  to  demonstrate  the 
effects (Figure 20 (b)) of overlap from an upper line where only a half space between lines
was  used.     Particularly  noteworthy  are  the  differences  in  the  character  impressions  resulting  from 
one  or  more  variable  factors  such  as  key  pressure  or  from  use  of  a  different  section  of  the  ribbon  or 
different  paper  characteristics  at  a  different  location  on  the  page.     Figure  21  illustrates  noisy 
characters,   bleeding,   and  extensive  running  overlap  between  lower  case  characters  from  the  L.C. 
Smith  Model  Seventeen  sample. 
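
The effect of a broken stroke on a recognition logic that counts black "blobs", as noted above for the two samples of "B" in Figure 18, can be shown with a minimal sketch. It assumes a single scan reduced to a string of '#' (black) and '.' (white) cells; the representation and the function name are hypothetical.

```python
def count_black_runs(scan_line):
    """Count the maximal unbroken runs of black cells along one scan."""
    runs = 0
    previous_black = False
    for cell in scan_line:
        is_black = cell == "#"
        if is_black and not previous_black:
            runs += 1          # a new run starts at each white-to-black transition
        previous_black = is_black
    return runs
```

An intact vertical stroke such as `"..#########.."` gives a count of one, whereas the same stroke with a single white cell across it, `"..####.####.."`, gives a count of two, so that two impressions of the same letter would present different features to the logic.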

Variations  in  operator  touch,    condition  of  the  type  keys,    paper,   and  ribbon  all  affect 
character  quality.     Such  variations  result  in  non-uniform  character  shapes  because  of  uneven 
distribution  of  ink,    broken  strokes  and  missing  portions  of  characters,    bleeding,   filling,   overlap, 
and  non-uniform  positioning  and  spacing.     Wide  discrepancies  are  found  for  typed  characters  even 
when typed on the same page and by the same typewriter. For example, Rabinow found that successive
imprints of the same character on the same page varied in reflectivity by as much as 30 per
cent. 1/

Characteristics  of  the  paper  and  the  paper  backing,    such  as  the  use  of  short  or  long  fibres, 
affect  the  amount  of  indentation  that  occurs  when  the  type  strikes  the  paper  and  thus  determine  the 
quality  of  the  character  edge. 

Even  when  of  the  same  make  and  grade,   typewriter  ribbons  of  cotton,    silk,   or  nylon  give 
variable  performance  on  wear-down  tests  which  measure  the  relative  deterioration  of  the  ink 
density  after  repeated  typing  over  the  same  portion  of  the  ribbon.     Federal  specifications  require 
that  an  acceptable  ribbon  should  show  no  filling  in  the  typed  character  after  800  typings  of  lower 
case  "e",   but  such  filling  from  dirty  keys  as  well  as  from  inferior  ribbons  or  ribbons  long  in  use 
must  be  expected  in  average  output  of  typed  material. 

5.4    Difficulties with Other Types of Special Material

Two other categories of material that may present specialized difficulties for automatic
reading are of special interest: printed text in other than Roman script, especially foreign
language material, and stencilled information. The special problems
of  ideographic  languages,    such  as  Chinese,    where  the  character  itself  is  often  the  basic  semantic 
unit  of  the  language,    will  not  be  considered  here,    since  such  problems  are  almost  as  much  a  matter 
of  mechanized  translation  as  they  are  of  mechanized  character  reading. 

A requirement that a reader system be capable of handling varied foreign language material
first of all implies the need for a relatively larger vocabulary, since languages such as Russian,
Arabic, and the like use alphabets containing character shapes not found in the English (Latin)
alphabet. Secondly, in many highly inflected languages extensive use of diacritical marks provides,
in effect, two or more versions of various character shapes. Since such marks may be relatively
small, they add to the problems of distinguishing one character from another. They also add to the
problems of discriminating between valid marks of small size on the one hand and general background
noise on the other.

1/ Rabinow, J. "Report on DOFL First Reader", Ref. 367.

Figure 17. Example of Samples Obtained from Different Typewriters

Some  of  these  difficulties  are  illustrated  in  the  case  of  Cyrillic  script  where,   for  example,   the 
Russian  letters  "shah"  and  "shchah"  (see  Figure   13)  are  distinguishable  only  by  a  small  difference  in 
total  area  and  shape,   and  where  certain  letters  can  be  distinguished  from  each  other  only  by  the 
diacritical  marks.     The  confusion  of  these  letters  with  each  other  might  well  result  in  added 
ambiguity of subsequent interpretation, either in an automatic dictionary or in an automatic
translation application. For a simple example, the use of "short ee" rather than "ee" in Russian may
distinguish between the singular and plural number of certain masculine nouns with soft endings.
Similarly,   the  difference  between  Russian  "yeh"  and  "yio"  may  discriminate  between  the  instrumental 
and  prepositional  cases  of  the  interrogative  pronoun  meaning  "by  what"  or  "about  what". 

As  has  been  previously  noted,   a  study  of  Russian  printing  quality  has  been  conducted  by  New 
York  University  for  the  Rome  Air  Development  Center,    in  connection  with  a  program  looking  toward 
mechanized  translation.     In  the  report  on  the  results  of  this  study,   a  number  of  difficulties  are 
stressed. 1/ These difficulties include variable spacing, both narrow and wide, either to fill out a
short  line  or  to  provide  an  effect  equivalent  to  the  use  of  italics,   as  well  as  a  wide  variety  of 
character  deformations  and  the  close  similarities  between  several  different  characters.     The  effect 
of  operational  requirements  for  applications  involving  automatic  reading  of  Russian  print,   therefore, 
is  directly  related  to  system  design  considerations,   as  the  NYU  researchers  point  out.     They 
conclude  that  the  poor  quality  "militates  against  character  discrimination  through  observation  of  fine 
or  discursive  detail";  that  methods  of  discrimination  based  upon  relatively  invariant  gross 
characteristics  and  fundamental  structure  "offer  a  greater  chance  of  success",   but  that  a  great  many 
'bits' of information may be required for recognition purposes if frequently mutilated characters are
not to be mistaken for their 'homomorphs'. 2/

In  the  case  of  stencilled  information,   the  necessary  vocabulary  is  relatively  limited  in  size, 
since  normally  only  numerals  and  alphabetic  upper  case  characters  are  used,   but  almost  all 
characters  have  arbitrarily  broken  strokes  such  that  considerable  increase  in  recognition  logic 
complexity  may  be  required  to  eliminate  ambiguous  or  erroneous  identifications. 

5.5    Criteria for Performance Measurement


The  critical  factors  in  automatic  reading  problems  include  not  only  the  overall  and  specific 
performance requirements and the difficulties inherent in the scanning-recognition of printed, typed,
or stencilled materials, but also the selection and use of appropriate criteria for measuring the
performance  of  equipment  to  be  evaluated. 

The  obvious  standard  of  reference  is  of  course  comparison  of  reader  performance  with  human 
performance  in  scanning  and  recognition  of  similar  material.     Yet  the  adoption  of  such  a  criterion 
would  create  great  difficulties  and  would  in  most  cases  be  misleading.     In  the  first  place,   human 
performance  is  the  product  of  many  factors  other  than  visual  acuity,    specifically  including 
psychological "set" and expectancy based upon prior context. For example, a series of psychological
experiments has shown that the reproduction of ambiguous stimulus figures is definitely influenced by
concurrent stimuli, such as a suggestive word. Thus a serpentine curve, like a reversed "S", was
"recognized" as "8" where the word "eight" was concurrently given and as "2" where "two" was
given. 3/ Again, such hypotheses as the principles of "closure" and "good
continuation"  of  Gestalt  psychology  may  be  the  basis  for  human  ability  to  fill  in  holes  and  breaks  in  a 
character  image  of  poor  quality. 

1/ Boni, C. "Russian type study," Ref. 55.

2/ Ref. 55, pp. 1-6, 3-1, 3-2.

3/ Carmichael, L., H. P. Hogan, and A. A. Walter. "An experimental study of the effect of
language on the reproducibility of visually perceived form," Ref. 69. See also Stevens, S.
"Handbook of experimental psychology", Ref. 463.

In the second place, human errors result from such causes as inattention, too great a span of
perception as in copying from two different lines, and mistaken expectancy, as well as from
misreadings of poor quality or ambiguous material or material read under inadequate lighting
conditions. A variety of human errors may also occur in the transcription part of the process, such as
inversion of two or more characters, striking of the wrong key, skipping a line, and the like. Thus
human errors in a data-copying or transcription process may be highly erratic and difficult to check
without 100% verification. The errors to be expected in automatic reading equipment, however, tend to
be systematic (an "O" for a "Q", and an "8" for a "B") and can therefore be subjected, if desired, to
equally systematic error-detecting and possibly error-correcting processes. Moreover, in many cases,
reject thresholds can be set sufficiently high as to reject as nonrecognitions all characters
ambiguous enough to make misrecognition likely.
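
The reject-threshold idea can be illustrated with a short sketch. It is not the logic of any reader described in this report: the match scores, the `threshold` and `margin` parameters, and the function name `identify` are all assumptions chosen for the example.

```python
def identify(scores, threshold=0.8, margin=0.1):
    """Return the best-matching character, or None to signal a reject.

    scores: dict mapping candidate character -> match score in [0, 1]
    (hypothetical values; a real reader would derive them from its
    own recognition logic).
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_char, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    # Reject when the best match is weak, or when it is too close to the
    # runner-up for the decision to be trusted.
    if best_score < threshold or best_score - runner_up < margin:
        return None
    return best_char


clear = {"O": 0.95, "Q": 0.60, "8": 0.20}
ambiguous = {"O": 0.86, "Q": 0.84, "8": 0.20}
print(identify(clear))      # O
print(identify(ambiguous))  # None
```

Raising `threshold` or `margin` in such a scheme trades misrecognitions for nonrecognitions, which is the trade discussed in the text.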

The criterion of intercomparison of human and machine performance should therefore be limited
to the comparative costs, comparative time figures, and comparative error rates on the same or
similar material for manual and for automatic reading-transcription. In our 1957 survey of character
recognition we gave as an example an automatic reader device capable of recognizing the upper case
vocabulary of machine tabulator output at a rate of only 15 characters per second, with an error rate
not greater than one character per thousand, and costing $50,000 as capital investment. Such
equipment, with performance specifications considerably below those now obtainable, could provide
work-output equivalent to that of 5 GS-3 keypunch operators copying at a steady rate of three
characters per second for an eight-hour shift. The cost of these keypunch operators, in direct
salaries alone, would be at least $15,000 per year. Assuming that the costs of operating the reader
equipment are directly comparable to the recruitment, training, replacement, supervision, leave
reserve, and overhead costs, plus rental of keypunch equipment for the manual operators, such a
device could be amortized in about seven years. If, under similar circumstances, the reader can
recognize 100 characters per second, a machine costing $250,000 could be amortized in five years or
less, and its use might well be justified on the basis of economy alone.
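
The operator-equivalence arithmetic behind this example can be written out as a small sketch. The function names are hypothetical, and the per-operator salary is simply inferred from the figure of $15,000 for five operators. The sketch does not attempt to reproduce the amortization periods, which depend on operating-cost assumptions not itemized here.

```python
def operator_equivalents(reader_cps, operator_cps=3.0):
    """Number of keypunch operators whose steady output the reader matches."""
    return reader_cps / operator_cps


def salaries_displaced(reader_cps, salary_per_operator, operator_cps=3.0):
    """Annual direct salaries of the operators the reader could replace."""
    return operator_equivalents(reader_cps, operator_cps) * salary_per_operator


# $15,000 per year for 5 operators implies $3,000 per operator per year.
SALARY = 15_000 / 5

print(operator_equivalents(15))                 # 5.0
print(salaries_displaced(15, SALARY))           # 15000.0
print(round(salaries_displaced(100, SALARY)))   # 100000
```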

Our earlier conclusions have been reaffirmed in an independent study 1/ of the use of optical
scanning equipment for the oil industry, conducted in November 1959. In this study, it was
determined that the break-even point is where four to six keypunch operators would be involved on a
particular application. This later estimate is based upon the assumption that there are 7,000 strokes
per hour per operator (a 2 per second average). For a Farrington-IMR installation at the Atlantic
City Electric Company it has been reported that the reader does the "work of 8 tape-writer operators
in 1/20 the time," with a 2% reject rate, and with no errors as yet discovered. 2/

Similar  conclusions  may  be  reached  by  considering  costs  per  word  read.     At  a  Seminar  on 
Machine  Indexing  held  at  American  University  in  February  1961,    several  speakers  agreed  that  the 
average cost of keypunching natural language text material is approximately 2.5 cents per word,
including verification. At such a rate, the keypunching of just the text of the Harvard Classics,
assuming 100 volumes with an average of 100,000 words each, would cost more than twice as much as the
quoted development costs for page-reader recognition system proposals from several different
potential suppliers. When that famous "five-foot shelf" is considered in relation to the collections of
printed literature that would be involved in mechanized translation operations, the need for automatic
character recognition devices is quite obvious. Criteria for performance specifications and for
performance measurement with respect to such potential applications should therefore be appraised
realistically  in  terms  of  the  need  and  of  the  comparative  costs. 
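
The cost-per-word comparison above reduces to one multiplication, worked here with the figures quoted in the text (the variable names are ours):

```python
VOLUMES = 100
WORDS_PER_VOLUME = 100_000
CENTS_PER_WORD = 2.5   # keypunching including verification, as quoted above

total_words = VOLUMES * WORDS_PER_VOLUME
total_dollars = total_words * CENTS_PER_WORD / 100

print(total_words)     # 10000000
print(total_dollars)   # 250000.0
```

On these figures the keypunching bill is $250,000, so the statement that it is more than twice the quoted development costs implies that those quotations were below $125,000.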

A  second  criterion  for  both  the  initial  design  of  automatic  reading  equipment  and  its  subsequent 
performance  evaluation  is  the  tolerance  allowed  in  the  system  for  character  degradation  or 
deterioration. In studies made at IBM, Greanias and Hill 3/ defined two major factors that might be
used  in  the  establishment  of  a  criterion  to  test  various  proposed  devices.     These  two  factors  are: 

(1)  Permissible  character  deterioration,    determined  by  measurement  and  integration  of  total 
detectable  ink  density  of  an  actual  symbol  over  the  entire  symbol  space  as  against  the  total 
density  of  an  ideal  image  of  the  same  symbol;  and 

(2)  A  style  factor  which  is  a  function  of  the  number  of  styles,   the  number  of  identifications 
(size  of  total  symbol  vocabulary),   and  the  similarity  of  symbols  that  do  not  have  the 
same  identification.     More  precisely,   the  style  factor  measures  the  dissimilarity  of 
different  symbols  in  terms  of  the  total  area  of  a  given  symbol,   that  area  of  that  symbol 
which  is  not  part  of  a  different  symbol,   and  the  area  of  the  different  symbol  that  is  not 
common  with  that  of  the  first. 
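
The quantities entering these two factors can be sketched for small binary symbol grids. This is only an illustration: the report does not give the Greanias and Hill formulas, so the code stops at computing the raw terms (a density ratio, and the three areas named in factor 2). The function names and the 3x3 test symbols are assumptions.

```python
def deterioration_ratio(actual, ideal):
    """Total ink density of the actual symbol relative to the ideal image."""
    total_actual = sum(sum(row) for row in actual)
    total_ideal = sum(sum(row) for row in ideal)
    return total_actual / total_ideal


def area_terms(a, b):
    """Areas used in comparing two binary symbols a and b.

    Returns (area of a, area of a not shared with b,
             area of b not shared with a).
    """
    pairs = [(pa, pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    area_a = sum(1 for pa, _ in pairs if pa)
    a_only = sum(1 for pa, pb in pairs if pa and not pb)
    b_only = sum(1 for pa, pb in pairs if pb and not pa)
    return area_a, a_only, b_only


# Hypothetical symbols: an ideal "L", a worn imprint of it, and an ideal "T".
IDEAL_L = [[1, 0, 0],
           [1, 0, 0],
           [1, 1, 1]]
WORN_L = [[1, 0, 0],
          [0, 0, 0],
          [1, 1, 0]]
IDEAL_T = [[1, 1, 1],
           [0, 1, 0],
           [0, 1, 0]]

print(deterioration_ratio(WORN_L, IDEAL_L))   # 0.6
print(area_terms(IDEAL_L, IDEAL_T))           # (5, 3, 3)
```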

1/ As reported by Keller, A. E. "Optical scanning - an unlimited horizon", Ref. 249, p. 25.

2/ See Vogler, G. W. "Optical scanning of customer accounts," Ref. 518, p. 26.

3/ Greanias, E. C., and Hill, Y. N. "Considerations in the design of character recognition
devices," Ref. 172.

Character deterioration, as related to maintenance of consistent good quality input, is a key
factor in determining the permissible reject rates that must be balanced against needs and
comparative costs. These rates can be, in fact, far higher than might be supposed. Table II
illustrates some of the relationships between economic break-even points and reject rates, on the
basis of the following assumptions:

(1) Salaries of keypunch operators are at the rate of $4,000 per annum;

(2) The costs of one or two operators for the reader, power consumption, maintenance,
and the like, are assumed to be more than balanced by the costs of supervision, overhead,
and training of operators, which are not included in potential savings;

(3)  The  reject  rate  is  assumed  to  be  for  documents,   not  characters;   and 

(4)  The  recognition  rate  of  the  reader  is  sufficient  to  replace  the  operators  shown, 
assuming that they work steadily at 3 characters per second. 1/

As  a  safety  factor,   the  reject  rates  indicated  in  Table  II  might  well  be  cut  in  half,   and  there 
would still be good reason to believe that a $300,000 reader with a document reject rate of 35% would
produce direct cost savings in situations where the workload requires 50 or more keypunch operators.
Similarly, a reader costing only $100,000 as original investment might be applied, with direct savings,
in  situations  requiring  only  10  operators,   provided  that  the  document  reject  rate  is  no  more  than  20%. 
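
The entries of Table II are consistent with a simple rule: annual cost equals one-fifth of the reader price plus $4,000 for each operator retained to handle rejects, compared against the salaries originally paid. The sketch below encodes that rule as we read it from the stated assumptions; the function names are ours.

```python
SALARY = 4_000           # assumption (1): dollars per operator per year
AMORTIZATION_YEARS = 5   # amortization period used in Table II


def annual_cost(reader_price, operators_retained):
    """Yearly amortization plus salaries of operators kept to handle rejects."""
    return reader_price // AMORTIZATION_YEARS + operators_retained * SALARY


def breaks_even(workload, reader_price, operators_retained):
    """True when the annual cost does not exceed the salaries originally paid."""
    return annual_cost(reader_price, operators_retained) <= workload * SALARY


# 10 operators, $100,000 reader, 40% document rejects -> 4 operators retained
print(annual_cost(100_000, 4))         # 36000
print(breaks_even(10, 100_000, 4))     # True
# 50 operators, $300,000 reader, 70% rejects -> 35 operators retained
print(annual_cost(300_000, 35))        # 200000
print(breaks_even(50, 300_000, 35))    # True
print(breaks_even(50, 300_000, 45))    # False
```

Halving the tabulated reject rates, as suggested above, corresponds to retaining half as many operators, which leaves a clear margin below the break-even line in both of the cases cited.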

Related to the question of reject rates that can be accepted, questions of performance
measurement with respect to a specific proposed character recognition system would also include the
ease with which reject reinstatements might be accomplished. For example, in several reader systems
under  development  provision  is  made  for  display  of  characters  that  have  not  been  recognized, 
together  with  provision  for  manual  insertion  of  the  correct  character  identification.     Thus  the 
question  of  performance  measurement  with  respect  to  rejects  and  errors  will  involve  considerations 
of  whether,   by  accepting  relatively  high  reject  rates,   the  possibilities  of  true  misrecognitions  can  be 
held  to  a  negligible  level,   of  the  ease  with  which  nonrecognitions  can  be  corrected,   and  of  the 
possibilities  of  combining  manual  and  machine  techniques  in  the  total  operation. 

A final area in which the development, selection, and use of appropriate criteria for
performance measurement of reader devices is needed is that of standardization of test methods and
test materials, including methods of measuring visibility and legibility of printed and typed
characters. Some studies have been made 2/ of relative readability of typed and printed materials in
various sizes
and  styles  for  human  readers.     However,   until  quite  recently  no  instruments  have  been  available  to 
obtain  quantitative  data  on  such  factors  as  the  actual  ink  density  of  characters  on  a  page.     For 
example,    present  Federal  specifications  for  typewriter  ribbons  use  test  procedures  where  subjective 
judgments  must  be  made  by  human  observers.     They  determine  the  deterioration  of  the  intensity  of 
typed  characters  after  repeated  typing  on  the  same  portion  of  the  ribbon  and  after  controlled 
exposure to simulated fading conditions to check relative permanence. A color densitometer 3/
has also been developed with which the actual reflectivity from a character space may be
measured precisely.

In  view  of  the  interdependence  of  such  variables  as  paper,    ribbon,   type  style,   and  operator 
touch in producing discrepancies in character impressions in typed material, it is clear that
considerable research would be necessary to prepare reproducible standard material for test of readers
designed  to  read  typewriter  output. 

Development  of  improved  methods  of  measurement  of  character  quality  for  both  printed  and 
typed  characters,  for  a  variety  of  sizes  and  styles,  a  variety  of  paper  stocks,  and  other  variable 
characteristics  of  the  carrier  medium  or  the  inscription  would  accomplish  the  following  objectives: 

1.  To  provide  quantitative  data  as  to  the  conditions  to  be  encountered  for  given  situations  in 
which  reader  devices  are  to  be  developed  or  applied; 

2.  To  provide  standards  for  controls  on  the  preparation  of  input  materials,    such  as  the 
maximum  usable  life  of  ribbon,    the  minimum  acceptable  quality  of  paper,   and  acceptable 
color  of  paper;  and 

3.  To provide means for calibration of standard test materials.

Even  more  directly,    such  improved  methods  would  facilitate  acceptance  and  quality  control  of  items 
such as paper, ribbons, typefaces, etc., that should be of considerable interest to large organizations
such  as  the  military  supply  agencies. 


1/ That is, for 100 operators, the reader should have a recognition rate of not less than 300
characters per second. This is well within the range of recognition rates that have been
demonstrated.

2/ See Greene, E. B. Refs. 176, 177 and Luckiesh, M., and F. K. Moss. Ref. 280.

3/ Sweet, M. H. "An improved photomultiplier tube color densitometer," Ref. 468.

TABLE II.  RELATIONSHIPS BETWEEN ECONOMIC BREAK-EVEN POINTS AND REJECT RATES

           Original     Operators     Costs, 5-Year Amortization Plus
Workload   Savings,     Retained      Salaries of Operators for Rejects
Operators  Operator     for           $100,000      $300,000      Reject
           Salaries     Rejects       Reader        Reader        Rate

    6      $24,000         1          $24,000                     (illegible)
    7      $28,000         1          $24,000                     10% +
    7      $28,000         2          $28,000                     25% +
    8      $32,000         2          $28,000                     25%
    8      $32,000         3          $32,000                     30% +
   10      $40,000         2          $28,000                     20%
   10      $40,000         4          $36,000                     40%
   10      $40,000         5          $40,000                     50%
   20      $80,000         4          $36,000       $76,000       20%
   20      $80,000         8          $52,000                     40%
   20      $80,000        12          $68,000                     60%
   20      $80,000        15          $80,000                     75%
   50      $200,000       20          $100,000      $140,000      40%
   50      $200,000       30          $140,000      $180,000      60%
   50      $200,000       35          $160,000      $200,000      70%
   50      $200,000       45          $200,000                    90%
  100      $400,000       40          $180,000      $220,000      40%
  100      $400,000       60          $260,000      $300,000      60%
  100      $400,000       80          $340,000      $380,000      80%
  100      $400,000       90          $380,000                    90%
6.     DESIGN CHARACTERISTICS OF SELECTED SYSTEMS

Assuming  that  overall  system  objectives  and  specific  operational  requirements  have  been 
determined  and  that  performance  specifications  have  been  established,   the  potential  customer  for 
an  automatic  character  reader  would  like  to  make  a  comparative  evaluation  of  specific  systems.     In 
the  present  state  of  the  art,   however,    such  a  customer  would  be  largely  limited  to  design  details 
rather  than  to  operating  factors  as  experienced  in  practical  working  situations.     By  analogy,   we 
might  say  that  a  number  of  American  and  foreign  automotive  vehicle  prototypes  are  available,   but 
that  very  few  people  have  as  yet  driven  them  under  realistic  road  conditions. 

The  examples  of  reader  systems  that  have  been  cited  in  the  discussions  of  the  generalized 
character  recognition  process  clearly  show  that  a  variety  of  techniques  can  be  combined  to  achieve  the 
objectives  of  automatic  reading  of  printed,   typed,   and,   in  some  special  cases,   handwritten, 
characters.     Consideration  of  certain  comparative  characteristics  of  selected  systems  that  have  been 
proposed  should  further  demonstrate  that  the  automatic  character  recognition  problem  has  been 
solved,   in  principle,   and  in  a  variety  of  different  ways,   for  the  closed  vocabulary,   high  quality  input, 
situations.     Unfortunately,   however,   performance  data  on  operational  systems  is  largely  lacking. 
With  the  exception  of  Farrington-IMR  installations,   there  is  almost  no  customer  experience  to  report. 
Deliveries  have  also  been  made  in  recent  months  by,   for  example,   Solartron  and  Rabinow  Engineering, 
but there is as yet insufficient information on operations of these systems. Therefore, the
intercomparison even of a relatively small number of systems must be largely limited to system design
characteristics. These include mechanisms for scanning and transformation of the source pattern,
principles of recognition logic processing that have been adopted, and bases for
recognition-identification decisions. We shall consider certain representative systems with respect to each of
these  categories  as  well  as  with  respect  to  a  summary  or  general  classification. 

6.1    Scanning and Transformations of Source Patterns

An earlier classification of various possibilities for automatic character recognition systems
was directed precisely to differences in scanning, sensing, and source-to-input-pattern
transformations. Thus, Cook 1/ in an earlier NBS study of print reading systems defined three major
categories as follows:

(1)  Optical  Matching.    In  such  systems  the  complete  input  image  in  two  dimensional  form  is 
compared  directly  with  two-dimensional  reference  images  stored  in  the  reader,   with  either  direct  or 
"best-fit-correlation"  identification. 

(2)  Scan  Line  Analysis.  In  these  systems  the  input  image  is  scanned  with  a  continuously 
moving spot so as to form a line pattern over the image field and produce a time-varying signal
which  is  then  compared  with  a  similar  signal  stored  for  each  of  the  vocabulary  reference  characters. 

(3)  Image  Point  Analysis.     In  such  systems,   the  input  image  is  analyzed  point  by  point  over  the 
entire image field in accordance with a specific pattern, and the values obtained from examining
each  point  are  processed  digitally  to  determine  the  characteristics  of  the  input  image. 

Frequently,  however,   variations  and  combinations  of  these  or  other  categories  occur  in 
proposed    devices  as,    for  example,    in  optical  matching  of  selected  image  points.     Moreover,   the 
use  of  specialized  transformations,    for  example,   by  techniques  of  autocorrelation,    provides  systems 
that  are  difficult  to  classify  in  precisely  these  terms. 

Obvious  differences  between  various  systems  do  continue  to  appear  with  respect  to  design 
characteristics  involving  holistic,   analytic,   and  criterial  feature  extraction  treatments  of  the  source 
pattern.     In  the  holistic  approaches,   the  source  pattern  as  a  whole  is  used  as  the  input  pattern,   the 
most  obvious  example  being  that  of  optical  reflectance  through  an  identical-shape,   identical-area 
mask. In most cases, however, the scan-input operation serves as a transducer: that is, it
converts the two-dimensional color-contrast source pattern into some other form. For example, the
original black-white contrasts may be converted into contrasts of positive or negative voltage
suitable for further processing in an analytic treatment.

1/ Cook, H. D. "A study of print reading systems leading to a proposed reader for typewritten
material," Ref. 80.

Several different transducing steps may also be involved: for example, a first transformation
to signals derived from some conventional means of scanning, followed by a procedure for encoding
the significant features of these signals. Thus Pahl, considering the possibilities for criterial
line-tracing, suggests use of the 'stopped-spot' principle of television scanning, as follows:

"In this system the scanning spot passes rapidly over areas of no contrast but halts
temporarily at boundaries, the positions of which are coded for transmission. The
advantage of the system is that signal redundancy is greatly reduced." 1/
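
The redundancy reduction described in the quotation can be illustrated in software terms. The sketch below is not a model of the television hardware: it assumes a scan line already quantized to 0 (white) and 1 (black), and the function names are ours. Only the positions of contrast boundaries are kept, and the line is recovered from them.

```python
def boundary_positions(scan_line):
    """Positions along a scan line where the contrast changes.

    scan_line: sequence of 0 (white) and 1 (black) samples.
    """
    return [i for i in range(1, len(scan_line))
            if scan_line[i] != scan_line[i - 1]]


def reconstruct(positions, length, first=0):
    """Rebuild the scan line from its boundary positions and starting value."""
    line, value, boundaries = [], first, set(positions)
    for i in range(length):
        if i in boundaries:
            value = 1 - value   # contrast flips at each coded boundary
        line.append(value)
    return line


line = [0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0]
edges = boundary_positions(line)
print(edges)                                    # [3, 7, 9, 11]
print(reconstruct(edges, len(line)) == line)    # True
```

Twelve samples are here represented by four boundary positions; the saving grows with the length of the uniform runs.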

In effect, a holistic representation is preserved in many of the special transformations, including
those of optical or space filtering and field potential derivatives. 2/ For those holistic systems
using optical, or photographic, template reference patterns, we note that the reference pattern
vocabulary may be composed of photographic negative images only, as in the DOFL First Reader, or
of both positive and negative patterns, as in a proposed RCA reader. A combination of photographic
negative masks and block positive masks to determine relative widths and heights is used in a
Baird-Atomic prototype page reader. The use of positive and negative masks eliminates the problem of
more than one character covering the same mask (upper case "O" and "Q" for example). Several
systems embody this principle. For instance, Ress and Greanias, in a patent assigned to IBM,
disclose the use of both positive and negative masks. 3/ Both positive and negative masks are also
suggested for use in the Briggs Associates lenticular array system 4/ for appropriate "select" and
"reject" processing.

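The "O" and "Q" difficulty with a single kind of mask, and its removal when positive and negative masks are combined, can be shown with a small sketch. The 4x3 character grids and the function names are assumptions made for the illustration; they do not represent the optics of any of the readers named above.

```python
def ink(rows):
    """Set of (row, column) positions that are black in a text bitmap."""
    return {(r, c) for r, row in enumerate(rows)
            for c, ch in enumerate(row) if ch == "#"}


def positive_match(image, mask):
    """Every black point of the reference is black in the input."""
    return mask <= image


def negative_match(image, mask):
    """No black point of the input falls outside the reference."""
    return image <= mask


REFERENCES = {
    "O": ink(["###",
              "#.#",
              "###",
              "..."]),
    "Q": ink(["###",
              "#.#",
              "###",
              "..#"]),
}

for name, image in REFERENCES.items():
    pos_only = [ref for ref, mask in REFERENCES.items()
                if positive_match(image, mask)]
    both = [ref for ref, mask in REFERENCES.items()
            if positive_match(image, mask) and negative_match(image, mask)]
    print(name, pos_only, both)
# O ['O'] ['O']
# Q ['O', 'Q'] ['Q']
```

With the positive test alone an input "Q" satisfies both references, since it contains all of the "O"; adding the negative test leaves a single identification.
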
In the analytic approaches to various possible methods for scanning, generation, and
transformation of an appropriate input pattern, given a source pattern sensed within the image field, we
find, first, the operations of segmentation, unitization, and quantization of the scanned source pattern
field. These may be operations of quantization in terms of a theoretically superimposed grid, or of
encoding of relative intensities of black encountered in a given vertical scan segment ('long',
'intermediate', or 'short' black blobs in a single vertical scan, for example), or of 'feature extraction'
by local operations. 5/
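
As an illustration of the second kind of quantization, the following sketch encodes one vertical scan as a sequence of 'short', 'intermediate', and 'long' black runs. The two cut-off lengths are arbitrary assumptions; an actual reader would fix them from the type size and scan resolution.

```python
from itertools import groupby


def black_runs(scan):
    """Lengths of consecutive black runs in one vertical scan of 0/1 samples."""
    return [len(list(group)) for value, group in groupby(scan) if value == 1]


def classify_run(length, short_max=2, intermediate_max=5):
    """Label a run by length; the two cut-offs are illustrative assumptions."""
    if length <= short_max:
        return "short"
    if length <= intermediate_max:
        return "intermediate"
    return "long"


scan = [0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0]
runs = black_runs(scan)
print(runs)                               # [2, 4, 7]
print([classify_run(n) for n in runs])    # ['short', 'intermediate', 'long']
```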

The approaches which involve what we have termed a criterial feature extraction treatment of
the scanning and transformation of the source pattern to an input pattern, and its subsequent further
transformation into the required format of reference patterns, include, in the simple case, operations
of eliminating redundancy by ignoring source pattern 'runs' of the same direction or without
significant change in character of successive scan segments. They also include, in the more complex
case, derivations such as those in the Bledsoe-Browning method 6/ of 'hit', 'partial-hit', and 'no-hit'
coincidences of the input pattern (derived by rectilinear mesh quantization from the source pattern)
with arbitrarily chosen n-tuples of particular coordinate description positions.
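
A much simplified sketch in the spirit of the n-tuple approach follows. It is not the Bledsoe-Browning procedure itself: it counts only full coincidences and omits the 'partial-hit' case, and the function names, the 3x3 test images, and the choice of six 2-tuples are all assumptions made for illustration.

```python
import random


def make_tuples(n_pixels, n, count, seed=0):
    """Choose `count` arbitrary n-tuples of pixel positions."""
    rng = random.Random(seed)
    return [tuple(rng.sample(range(n_pixels), n)) for _ in range(count)]


def states(image, tuples):
    """State (pattern of black/white values) observed at each n-tuple."""
    return [tuple(image[i] for i in t) for t in tuples]


def train(samples, tuples):
    """For each character, record the states seen at each n-tuple."""
    memory = {}
    for label, image in samples:
        seen = memory.setdefault(label, [set() for _ in tuples])
        for slot, state in zip(seen, states(image, tuples)):
            slot.add(state)
    return memory


def score(image, memory, tuples):
    """Count, per character, the n-tuples whose state was seen in training."""
    observed = states(image, tuples)
    return {label: sum(s in slot for s, slot in zip(observed, seen))
            for label, seen in memory.items()}


# Hypothetical 3x3 images, flattened row by row.
I_SHAPE = [0, 1, 0,  0, 1, 0,  0, 1, 0]
L_SHAPE = [1, 0, 0,  1, 0, 0,  1, 1, 1]

tuples = make_tuples(9, n=2, count=6)
memory = train([("I", I_SHAPE), ("L", L_SHAPE)], tuples)
result = score(I_SHAPE, memory, tuples)
print(result["I"] == len(tuples))   # True: every state of a training sample is stored
```

An unknown input would be assigned to the character with the highest score, or rejected if no score is sufficiently high.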

As we have noted, this latter method involves analysis in terms of coincidence with random
features, that is, the arbitrarily chosen n-tuples. A technique that is similar in terms of analysis
of random features has been proposed by Novikoff. 7/ He has proposed recognition by observing
differences  of  frequency  of  crossings  encountered  with  a  'randomly  tossed'  reference  pattern  element, 
such  as  a  line  segment  or  curve.     Scores  so  obtained  can  serve  as  master  reference  patterns  which 
will  enable  subsequent  identification  of  unknown  characters  regardless  of  rotation  or  translation  in 
the  source  pattern  image  field. 

In  some  cases,   the  analytic  and  criterial  approaches  are  combined.     The  method  termed 
"analytic-descriptive"  by  Rosenblatt  may  involve  either  an  encoded  version  of  a  coordinate 
description  analysis,    such  as  for  example  a  function  table  entry  designation  for  a  particular 
combination  of  two -valued  variables,   or,   more  commonly,    the  use  of  distinctive  features  as  well 
as  analytic  processing.     Rosenblatt  defines  his  category  as  follows: 

"The  analytic-descriptive  method  consists  of  reducing  a  stimulus  pattern,   or  configuration, 
to  a  simple"]    canonical  description  which  is  invariant  under  the  transformation  in  question. 
This  description  (generally  given  in  terms  of  measurements  of  lines  and  angles,    ratios  of 
dimensions,    etc.  )  can  then  be  compared  with  a  stored  set  of  master-descriptions  to  determine 
which  corresponds  most  closely  to  the  stimulus  on  hand.  "_§/ 
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Pahl,    P.M.     "Automatic  character  recognition,"    Ref.    350,    p.    15. 

2/ See Kazmierczak, H. "The potential field as an aid for character recognition," Ref. 246.

3/ Ress, T. I., et al. U.S. Patent 2,919,425, Ref. 384.

4/ Brown, L. R. "Nonscanning character reader uses coded wafer," Ref. 62.

5/ Bomba, J. S. "Alpha-numeric character recognition using local operations," Ref. 54.

6/ Bledsoe, W. W. and I. Browning. "Pattern recognition and reading by machine," Ref. 51.

7/ Novikoff, A. B. J. "Integral geometry as a tool in pattern perception," Ref. 340. See also
Refs. 329, 341, 494.

8/ Rosenblatt, F. "Perceptual generalization over transformation groups," Ref. 397, p. 65.

6.1.1  Special Transformations in Source-to-Input-Pattern Processing

With  respect  to  systems  involving  holistic  scanning  and  sensing  of  the  source  pattern,   that  is, 
systems  that  preserve  and  process  the  whole  of  the  source  pattern  image,   those  systems  which 
utilize  special  transformations  are  of  particular  interest  from  the  standpoint  of  differences  in  design 
characteristics. These include, first, various techniques for optical, or space, filtering. "Space
filtering" is defined by Gilmore 1/ as follows:

"... The process of selectively transmitting the spatial frequency components of a
pattern ... is similar to the selective transmission of one-dimensional frequency components
by an electrical filter. Any image-forming optical system acts as a space filter, since the
images it produces of a plane pattern are consistently distorted."

A number of suggestions in the technical literature indicate that optical filtering and Fourier
transform techniques of the type used for sharpening and re-focussing of poor quality photographic
images might be useful in the field of automatic character recognition. 2/ Elias, Grey, and
Robinson, 3/ for example, point out the analogies between the methods of multi-dimensional analysis
and those of electronic filtering, discuss the use of 2-dimensional filters with space-averaging
properties for improvement of degraded images and for discrimination in the presence of noise, and
suggest the applicability of these techniques to problems where a particular configuration is to be
sought in the image field.

The  so-called  "field  potential"  method  is  another  type  of  transformation.     This  method  is  under 
investigation  at  the  Technische  Hochschule,    Karlsruhe,    Germany.     As  described  by  Kazmierczak,    4/ 
there  is  a  preliminary  analytic  step  of  quantization,   but  it  is  claimed  that  the  subsequent  analysis  of 
the  potential  field  values  restores  enough  of  the  original  exact  shape  relationship  information  to 
provide,   in  effect,   a  holistic  basis  for  pattern-matching.      In  this  system,   the  output  of  the  scanning 
process  provides  a  quantized  voltage  distribution  in  20x10  shift  register  storage.     This  register  feeds 
a  network  of  resistors  to  establish  a  potential  field.     The  input  pattern  is  subjected  to  a  shifting 
process  such  that  it  is  centered  when  the  currents  through  left,    right,   top,   and  bottom  framing  leads 
are  equalized.     The  input  and  reference  pattern  elements  consist  of  the  values  of  the  potentials  and 
of  the  derivatives  of  each  of  the  points  in  the  field.     Currents  flowing  through  coupling  diodes  into 
the  resistor  network  are  also  used  in  terms  of  relative  inflow.     From  this  special  transformation  of 
the  source  pattern  into  a  field  potential  image,    such  discriminating  determinations  can  be  made  as: 
shapes open to the left, or to the right; a test point is totally enclosed by the character line; a
straight line is to the right of the test point; the test point is the end of a character line; and the
like. A first model of this system, limited to an initial vocabulary of 14 characters, was scheduled for
demonstration in April 1961. 5/

A third example of a substantially holistic input pattern which is a specialized derivative of the
source pattern is that of the photometric analysis system of Lohninger. 6/ The patent disclosure of
certain key features of this photometric system is as follows:

"Light  reflected  from  the  white  background  area  around  the  particular  character  is 
projected  ...    to  an  image  splitting  device  which  divides  the  projected  image  of  the  display 
character  along  a  reference  line  into  a  first  part  and  a  second  part.     The  total  light  flux  of 
each  part  is  measured  and  converted  into  a  potential  whose  magnitude  is  proportional  to 
the  measured  value.     The  two  potentials  are  compared  with  each  other  and  the  differential 
obtained.

1/ Gilmore, H. F., Jr. "The use of the Fourier transform in the analysis of visual phenomena,"
Ref. 161, p. 14.

2/ See, for example, Cutrona, L. J., E. N. Leith, C. J. Palermo, and L. J. Porcello. "Optical
data processing and filtering systems," Ref. 85; also Kovasznay, L. S. G. and A. Arman.
"Optical autocorrelation measurement of two-dimensional random patterns," Ref. 262.

3/ Elias, P., D. S. Grey, and D. Z. Robinson. "Fourier treatment of optical processes," Ref. 119.

4/ Ref. 246; see also Auerbach, Ref. 21, p. 333.

5/ A similar technique is disclosed by Steinbuch in British Patent 825598, assigned to Standard
Telephones and Cables, Ltd., covering: "Character recognition by examining predetermined
points in a character area, and applying electrical conditions to corresponding points in an
electrical arrangement so as to set up a plane field of flow or potential which simulates the
character; means responsive to the flow or potential conditions give an identifying output."
(Ref. 445, see also Ref. 454.)

6/ Lohninger, W. J. U.S. Patent 2,927,216, Ref. 278.

"The differential signal is fed to a servo system which is coupled to orient the projected
image relative to the image splitting device to reposition the projected image relative to the
reference line until the total light flux from the first part is equal to the total light flux from
the second part. The physical displacement of the projected image relative to the reference
line at condition of balanced light flux will vary for different characters and, therefore, this
displacement can be utilized to identify the displayed character presented.

"In the identification of a large number of distinctive characters two or more different
characters may produce the same displacement at balance. To prevent the generation of
spurious results the character can be further divided ... The light flux from each part is
then balanced about a separate reference line and ... values representative of the displacement
of the projected image relative to the reference lines from all of the initially split parts
is obtained ...

"The  projected  image  or  each  part  of  the  projected  image  can  be  displaced  angularly 
or  along  a  straight  line  relative  to  a  reference  line. 

"...    The  projected  image  may  be  divided  to  present  two  identical  images  .  .  . 

"One  image  can  be  displaced  along  a  straight  line  while  the  other  image  is  displaced 
angularly  to  obtain  two  values. 

"Thus,   by  utilizing  linear  displacement,  angular  displacements,    or  a  combination  of 
linear  and  angular  displacements,    discrete  values  or  photometric  centers  can  be  obtained 
for  each  character  presented  for  identification.  " 

Here  again  we  find  that  information  with  respect  to  the  whole  character  is  preserved  in  the  input 
pattern  transformations. 
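
The balancing principle in the disclosure quoted above can be illustrated numerically. The following Python sketch is only a hypothetical software analogue, not the optical and servo apparatus of the patent. It assumes that the character is available as a binary grid (1 = black), that the count of black cells may stand in for the measured light flux, and that the reference line is vertical; the function name `balance_position` and the two sample patterns are illustrative inventions.

```python
def balance_position(grid):
    """Return the (possibly fractional) column position of a vertical
    reference line that leaves equal black area on its left and right.

    grid is a list of rows of 0/1 cells (1 = black); columns have unit width.
    Black area is used here as an assumed stand-in for measured light flux.
    """
    column_area = [sum(row[c] for row in grid) for c in range(len(grid[0]))]
    total = sum(column_area)
    if total == 0:
        raise ValueError("pattern contains no black cells")
    half = total / 2
    accumulated = 0
    for c, area in enumerate(column_area):
        if accumulated + area >= half:
            # The balance line falls inside column c; interpolate linearly.
            return c + (half - accumulated) / area
        accumulated += area


# Two mirror-image sample patterns (hypothetical, 5 rows by 4 columns).
LETTER_L = [[1, 0, 0, 0]] * 4 + [[1, 1, 1, 1]]
LETTER_J = [[0, 0, 0, 1]] * 4 + [[1, 1, 1, 1]]

if __name__ == "__main__":
    # The two shapes balance at different displacements (about 0.8 and 3.2
    # column widths from the left edge), so the displacement separates them.
    print(balance_position(LETTER_L), balance_position(LETTER_J))
```

As the disclosure itself notes, a single displacement value may coincide for different characters in a large vocabulary; further splitting of the image, or a second (angular) displacement, would then be needed, and is not modelled in this sketch.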

The various types of special transformations may, however, be applied in techniques utilizing
analytic or criterial principles. Horwitz and Shelton, for example, have recently proposed an optical
autocorrelation technique in conjunction with a coordinate description method. 1/ That is, they
assume, first, that a binary representation of the source pattern is derived by placing a discrete
mesh or cell array over the character. They then describe both optical and electronic means for
deriving the autocorrelation function, which they explain as follows:

"The  geometrical  interpretation  of  the  autocorrelation  function  is  that  the  value  of  the 
autocorrelation  function  at  any  specified  point  is  proportional  to  the  number  of  pairs  of 
occupied  points  having  a  given  displacement  and  direction.     Physically,   this  is  equivalent 
to  taking  the  same  character  on  two  matrices  and  shifting  one  with  respect  to  the  other 
through  all  possible  translations  and  counting  the  number  of  coincidences  for  each  relative 
translation." 2/

This autocorrelation method thus provides a translation-invariant input pattern, regardless of
the original position of the source pattern: the number of pairs of 'black' cells of the coordinate
description array having each relative separation is plotted as a function of that separation.
As such, the method is to be contrasted with cross-correlation
methods, in which optical arrangements involving reference pattern apertures produce intensity
distributions in the output plane. These distributions are such that, for the case where the input
pattern exactly coincides with the reference pattern, the effect is the same as in direct matching, but,
for misregistered source patterns, a number of anomalies may arise.
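
The translation invariance claimed for the autocorrelation function can be checked on a small example. The Python sketch below is a minimal illustration under stated assumptions: the pattern is a binary grid, the pair count is taken over ordered pairs of black cells, and the helper names `autocorrelation` and `translate` are hypothetical. It is not the optical or electronic realization described by Horwitz and Shelton.

```python
from collections import Counter


def autocorrelation(grid):
    """Count ordered pairs of black cells for every displacement (dr, dc)."""
    black = [(r, c) for r, row in enumerate(grid)
             for c, v in enumerate(row) if v]
    counts = Counter()
    for r1, c1 in black:
        for r2, c2 in black:
            counts[(r2 - r1, c2 - c1)] += 1
    return counts


def translate(grid, dr, dc):
    """Shift the pattern inside a field of the same size.

    Cells pushed past the edge are dropped, so the caller is assumed to
    keep the pattern inside the field.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and 0 <= r + dr < rows and 0 <= c + dc < cols:
                out[r + dr][c + dc] = 1
    return out


# A hypothetical 'T' shape in the upper left corner of a 6 x 6 field.
T_SHAPE = [
    [1, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    moved = translate(T_SHAPE, 2, 3)
    # The two grids differ, but their autocorrelation functions agree.
    print(moved != T_SHAPE, autocorrelation(moved) == autocorrelation(T_SHAPE))
```

Because only differences of cell coordinates enter the count, any shift that keeps the whole character inside the field leaves the result unchanged.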

Horwitz and Shelton also consider the '180° ambiguity problem', by virtue of which discrimination
is not possible for certain characters in certain fonts. For example, a numeric '6' and a
numeric '9', or a lower case 'p' and 'd', may be rotational transforms of each other. This
180°-rotation ambiguity problem is attacked by considering an optical arrangement which is a modification
of a system proposed earlier by Kovasznay and Arman 3/ to provide autocorrelation in the vertical
direction and an autoconvolution operation in the horizontal direction.
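
The origin of the 180° ambiguity can be seen directly: a half-turn reverses every displacement between two black cells, and the pair count for a displacement always equals the count for its reverse. The following Python sketch, with hypothetical names (`autocorrelation`, `rotate180`) and a hypothetical 'p' shape on a 3 x 5 mesh, illustrates the ambiguity only; it does not model the autoconvolution remedy.

```python
from collections import Counter


def autocorrelation(grid):
    """Count ordered pairs of black cells for every displacement (dr, dc)."""
    black = [(r, c) for r, row in enumerate(grid)
             for c, v in enumerate(row) if v]
    return Counter((r2 - r1, c2 - c1) for r1, c1 in black for r2, c2 in black)


def rotate180(grid):
    """Rotate the pattern by a half-turn (reverse the rows, then each row)."""
    return [row[::-1] for row in grid[::-1]]


# A hypothetical lower case 'p'; its half-turn reads as a 'd'.
LETTER_P = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]

if __name__ == "__main__":
    letter_d = rotate180(LETTER_P)
    # Different shapes, identical autocorrelation functions.
    print(letter_d != LETTER_P,
          autocorrelation(letter_d) == autocorrelation(LETTER_P))
```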


1/ Horwitz, L. P. and G. L. Shelton, Jr. "Pattern recognition using autocorrelation," Ref. 220.

2/ Ref. 220, p. 183.

3/ See Ref. 262.

Among the optical schemes discussed for achieving the autocorrelation function is a method
involving the Fraunhofer diffraction pattern 1/ to provide a 'wave number spectrum' derived from a
given source pattern, which is then compared with that from a normalized mask for each of the
ideal-character reference patterns. Horwitz and Shelton note, however, that most of the optical
transformation procedures have certain disadvantages. Examples are the requirement that the source
patterns be available in the form of transparent images (e.g., photographic negatives), the limitation
of vocabulary to be handled at one time in a parallel matching operation, 2/ and the difficulties in
establishing a balance between individual character separability and multiple font recognition
possibilities.

Electronic  methods  for  realization  of  the  autocorrelation  model  of  character  recognition  are 
discussed,   especially  generation  of  the  required  input  pattern  by  putting  the  same  quantized  bit 
pattern  into  two  shift  registers  and  tallying  the  coincidences  occurring  each  time  one  of  these 
registers  is  shifted  with  respect  to  the  other.     A  partial  autocorrelation  function  derived  from  the 
counters  at  the  time  when  input  first  enters  the  register  serves  as  an  additional  input  pattern  element 
for  purposes  of  distinguishing  a  given  source  pattern  from  its  180°  rotation  equivalent. 
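
The shift-register tally lends itself to a compact sketch. The Python fragment below is an assumed one-dimensional analogue only: the quantized pattern is held as the bits of an integer, one copy is shifted against the other, and coincident ones are counted at each shift. The function name `coincidence_tally` and the sample bit patterns are hypothetical, and the partial autocorrelation used to resolve the 180° ambiguity is not modelled.

```python
def coincidence_tally(bits, width):
    """Tally coincident 1s between a register and a shifted copy of itself.

    bits  -- integer whose binary digits are the quantized pattern
    width -- register length in bits
    Returns a list whose k-th entry is the number of coincidences when one
    register has been shifted k places with respect to the other.
    """
    mask = (1 << width) - 1
    return [bin(bits & (bits >> k) & mask).count("1") for k in range(width)]


if __name__ == "__main__":
    # Ones at bit positions 0, 2, 3 and 5 of a six-stage register.
    print(coincidence_tally(0b101101, 6))
    # A pattern and its end-for-end reversal give the same tally, which is
    # the one-dimensional counterpart of the 180-degree ambiguity.
    print(coincidence_tally(0b110100, 6) == coincidence_tally(0b001011, 6))
```

Entry zero of the tally is simply the number of ones in the register; the remaining entries are the one-sided autocorrelation values.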

6.1.2  Characteristics of Selected Coordinate Description Methods

Techniques for automatic character recognition which we have termed 'coordinate description
methods' are closely related to Cook's 'image point analysis' category. They may be employed in
either an essentially holistic approach, as in Taylor, 3/ or in an analytical one, as in Bomba's
feature extraction system. 4/ Comparative design characteristics that are of particular interest with
respect to these techniques are those of degree of mesh, or resolution, and of the thresholds or
clipping levels for integration or quantization.

If  the  source  pattern  is  to  be  subdivided  into  discrete  sub-areas  as  a  part  of  the  input  process, 
it  is  obvious  that  some  choice  as  to  the  size  of  these  sub-areas  must  be  made.     If  the  sub-area  is 
large  with  respect,    say,   to  a  character  stroke  (low  resolution),   then  only  gross  discriminating 
possibilities  will  be  available.     Generally,   only  a  small  vocabulary  of  recognizable  patterns  can  be 
tolerated.     On  the  other  hand,   if  the  sub-areas  are  small  with  respect  to  character  strokes  (high 
resolution)  then  random  noise,   whether  additive  or  reductive,   may  seriously  interfere  with  correct 
identification decisions. Thus Unger emphasizes the following points:

"Regardless of how fine a grid is used, there are still important transformations which
occur when a continuous line is mapped onto a discrete grid ...

"It is important that any system for pattern detection or recognition be relatively
insensitive to minor irregularities in the input fields. Interchanging zeroes for ones in a
few isolated cells should not cause significant changes in the output of a pattern processing
system ... Some smoothing is achieved merely by quantizing the input. This however, is
not adequate, and in some cases the quantization itself introduces irregularities." 5/

In addition, there are usually more reference pattern elements to be stored in the high
resolution systems. If multiple patterns for the same 'character' subject to translation, rotation,
and size variations are to be accommodated, the storage problem increases exponentially. It is for
this reason that Fain 6/ and others make the significant point that the elimination of the non-criterial
information should be accomplished as early in the scanning-input process as possible.

1/ Ref. 220, pp. 179-180.

2/ That is, "The size of the alphabet that can be handled in a parallel optical scheme is limited by
the total light energy available from the source; the energy in each channel goes down with the
reciprocal of the number of channels." (Ref. 220, p. 175.)

3/ The Taylor system, as we have previously noted, does allow quantization of the original source
pattern field, but provides a number of different possible values, representing different
percentage coverages of black in a given sub-area, on the grounds that integration to a binary level
would seriously impair the kind of flexibility found in human recognition. (Taylor, W. K.
"Pattern recognition by means of automatic analogue apparatus," Ref. 482, p. 198.)

4/ Bomba, J. S. "Alpha-numeric character recognition using local operations," Ref. 54.

5/ Unger, S. H., "Pattern detection and recognition," Ref. 501, pp. 1738, 1739.

6/ See Fain, V. S., Refs. 127, 128.

The resolution or fineness of mesh of the coordinate description grid actually used in different
systems varies widely. Some examples ranging from low to relatively high resolution, as used in
actual systems or in related pattern recognition experiments, are shown in Table III.

It has been widely noted that the scanning-input pattern processing steps of most recognition
systems are information-destructive operations. 1/ The extent of information loss in the
transformation of the source pattern to the input pattern is, in coordinate description techniques, related
both to the resolution of the grid and to the thresholds for integration or quantization. As Ress and
Greanias point out, the level at which clipping occurs establishes the threshold or criterion for
distinguishing between black and white: "That is, a decision is made as to whether the area being
scanned is dark enough to be considered black." 2/

In general, the threshold value usually used is that of 'more than half'. That is, if more than
50% of the area covered by the grid cell is black, the whole cell is treated as though it were black, and
conversely for white. Thus, Baumann, among others, considers that: "An elemental area is
considered to be one bit of character information if more than 50 per cent of the elemental area is
covered by the character." 3/ Wada 4/ and Iijima 5/ provide for a system design where the more
general case is considered, that is, where a proportion k of the area to be quantized is the threshold
value. Iijima in particular considers the case where k is about 16% of the elemental area.
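
The effect of the threshold proportion k can be shown with a few numbers. The Python sketch below is illustrative only. It assumes that the fraction of black in each grid cell has already been measured, and that a cell is called black when that fraction strictly exceeds k; whether the cited systems use a strict or a non-strict comparison is not stated in the text above. The function name `quantize` and the sample coverages are hypothetical.

```python
def quantize(coverage, k=0.5):
    """Reduce fractional black coverage per cell to a binary pattern.

    coverage -- rows of numbers between 0 and 1 (fraction of each cell
                that is black)
    k        -- threshold proportion; a cell becomes 1 (black) when its
                coverage is strictly greater than k (an assumed convention)
    """
    return [[1 if fraction > k else 0 for fraction in row] for row in coverage]


# Hypothetical coverages for a 2 x 3 patch of cells.
PATCH = [
    [0.10, 0.40, 0.60],
    [0.20, 0.55, 0.90],
]

if __name__ == "__main__":
    print(quantize(PATCH))           # 'more than half' rule
    print(quantize(PATCH, k=0.16))   # a lower threshold, of the order Iijima considers
```

With the lower threshold, cells that are only lightly covered are retained as black, so thin or faint strokes survive quantization at the cost of admitting more noise.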

Among the various design characteristics that have been suggested for the integration or
averaging operation is that of Greanias, in a patent assigned to IBM, which uses a small diameter
scanning spot and a data consolidation circuit. "The data gathered by the photomultiplier during a
predetermined number of these sweeps is amplified and clipped at upper and lower levels to provide
significant data to an accumulator means which can remember what percentage of the time the
presence of a portion of a character existed during said predetermined number of scans." 6/ After
the last sweep, determination is made as to whether or not that percentage of time-seeing-black is
sufficient to match the threshold established for considering the entire sub-area scanned to be black.
In a later patent issued to Greanias, Hamburger, and Leimer, 7/ further developments with respect
to integration are discussed. These include determinations of the relative location within the
sub-area scan cycle of detected black, and a second level of integration by local context, that is,
reference to the results of scan obtained for a neighboring sub-area, such as the previously integrated
cell that is immediately adjacent in either a horizontal or vertical direction.

A  second  level  of  integration  or  quantization  is  thus  developed  in  some  analytic  systems  to 
achieve  input  pattern  improvement.     This  is  usually  accomplished  by  local  averaging  operations  or 
spatial  transformations  of  adjacent  input  pattern  elements.     This  type  of  process  is  described  by 
Kamentsky  as  follows: 

"The  signal  field  is  usually  resolved  into    n    independent  elements.     During  a  spatial 
transformation,   the  state  of  each  element  is  examined.     The  state  of  this  element  and  other 
elements  whose  coordinates  are  specified  with  respect  to  the  examined  one  are  functionally 
related  by  a  specific  rule  to  determine  the  transformed  state  of  the  examined  element.  "  8/ 

We shall therefore consider such local operations with other image enhancement and input pattern
improvement operations, below.

1/ Compare, for example, Kamentsky: "The pattern-recognition machine is thus a device for
performing an information-destructive transformation on the signal field ... Internally, the
machine may perform a sequence of information-destructive transformations." (Ref. 244,
p. 304.)

2/ Ress, T. I., et al. U.S. Patent 2,919,425, Ref. 384.

3/ Baumann, D. M., et al. "Character recognition and photomemory storage devices feasibility
study," Ref. 43.

4/ Wada, H., et al. "An electronic reading machine," Ref. 521.

5/ Iijima, T. "Basic theory of pattern recognition," Ref. 225.

6/ Greanias, E. C. U.S. Patent 2,928,073, Ref. 174.

7/ U.S. Patent 2,959,769, Ref. 173.

8/ Kamentsky, L. A. "Pattern and character recognition systems; picture processing by nets of
neuron-like elements," Ref. 244, p. 305.

TABLE III.  EXAMPLES OF COORDINATE DESCRIPTION RESOLUTION

Horizontal x Vertical   Total     System

3 x 5                   15        Early READATRON prototype, National Data Processing Corporation
4 x 5                   20        Fitch patent
5 x 7                   35        RCA and NBS-DOFL standardized font
5 x 9                   45        Suggested standardized font, X3-1
6 x 9                   54        GE 69-A font for optical scanning
6 x 10                  60        READATRON prototype for Farrington Selfchek font
7 x 10                  70        IBM 1210 Sorter-Reader (MICR)
8 x 10                  80        Howard experiments
5 x 16                  80        Burroughs-Control Instrument reader for National City Bank
9 x 9                   81        Taylor system
10 x 10                 100       Weaver patent
12 x 14                 168       Solartron ERA prototype, actual character space
10 x 20                 200       Kazmierczak system
12 x 22                 264       Philco system
18 x 18                 324       Glover experiments
20 x 20                 400       Uhr-Vossler experiments
16 x 28                 448       Solartron ERA, space in which characters may appear
25 x 25                 625       General Motors program
30 x 32                 960       Hill studies at IBM; Baran and Estrin program
32 x 32                 1,024     Doyle program for hand-printed characters
40 x 64                 2,560     Grimsdale, Sumner system
60 x 90                 5,400     Bomba experiments
90 x 90                 8,100     1955 Selfridge-Dinneen experiments
100 x 100               10,000    Hodes experiments; R. N. Shepard studies
176 x 176               30,976    SEAC-SADIE input for picture processing experiments at NBS

6.1.3  Various Methods of Input Pattern Improvement

Image improvement operations may occur in the source-to-input-pattern sensing and
transformation process-steps for any of the input methods, whether holistic, analytic, or criterial. In the
DOFL First Reader holistic method, for example, a procedure for integrating over small sub-areas
of the source pattern provides for reduction of noise and for image enhancement, although it is still the
pattern as a whole that is subjected to template matching in the recognition process.

An  example  of  design  characteristics  used  to  effect  certain  combinations  of  input  pattern 
improvement  operations  in  a  system  that  is  in  actual  operation  is  found  in  the  Farrington-IMR  Scandex 
reading  machines.     First,   a  video  channel  is  provided  to  amplify  the  video  signal  to  each  of  two 
scanner  photomultipliers.     A  feedback  voltage  is  developed  which  controls  sensitivity  and  sets  a 
threshold  which  pulses  must  exceed  if  a  signal  is  to  be  produced.     The  pulses  are  squared  and  clipped 
at the appropriate voltage level. Further details are as follows:

"The feed-back voltage operates to maintain constant voltage difference between the dark
pulse and the observed document background. The dark pulse is then reduced to an arbitrary
grey level marking the lightest printing safe for the machine to read. The threshold which
pulses must exceed is then automatically adjusted by the grey pulse or scanning signals,
whichever has greater amplitude. The resulting signals are sharpened, the grey pulse removed
and the residual signal clipped. The processed signal has greatly improved contrast and a near
optimum signal to noise ratio over the usable range of character darkness." 1/

Input pattern improvement achieved by treating as redundant adjacent source pattern information
if there is contiguity in direction, color, etc., is found in many of the criterial feature extraction
techniques. Baumann, in proposing what he calls "weighted area scanning" techniques, 2/ suggests
first that there should be a reduction of character information at the transducer in order to reduce the
complexity of the subsequent recognition logic. It is observed further that:

"Since ... much of the character information does not uniquely describe the characters,
and ... all of the allowable area is not covered by the character, some method of disregarding
all the redundant information is implied in the desire to reduce the information at the
transducer. Because the character information of the symbol is to be compared to a storage that
also contains character information, a logical procedure is to compare the symbol to a
photographic transparency which has opaque areas at all the character information positions which
are not required to be recognized and therefore produces a substantial reduction of redundant
information ...

"The amount of reduction of information is dependent primarily upon the configuration
of the mask ... The desired procedure is to have several different masks, each one
designated to sort out a particular subclass of characters."

The  system  design  problem  with  respect  to  the  criterial  feature  (or  area)  improvements  to  the  input 
pattern  is  of  course  to  achieve  a  proper  balance  between  elimination  of  information  not  necessary  for 
discrimination  and  retention  of  sufficient  redundancy  to  enable  identification  of  somewhat  noisy  or 
poor  quality  characters.     In  many  systems,   it  is  necessary  first  to  center  or  frame  the  character 
pattern. 

As we have noted previously, several systems utilize a shift register technique either for
multiple comparison matchings of the input pattern (i.e., the pattern in a number of successively
displaced positions) with the reference patterns, or for transformations that bound, frame, or center
the character. This latter purpose can be achieved by detection of the first coincidence of 'black'
with top, bottom, and side edges, 3/ or by determination of the 'center of gravity' of the displayed
'black' configuration. Center-of-gravity positioning may be especially important in cases where the
source patterns are likely to be badly smeared, in one or another direction. Smearing often results
from the imprinting process, such as that of high speed printers. A special case of
center-of-gravity positioning, with respect to both horizontal and vertical axes, is included in the photometric
analysis recognition method of Lohninger, as previously cited. 4/
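
The two framing methods just mentioned can be contrasted in a short sketch. The Python fragment below is a minimal illustration under stated assumptions: the input pattern is a binary grid, the 'center of gravity' is taken as the mean row and column of the black cells, and centering is done by a whole-cell shift toward the middle of the field. The function names and the sample pattern are hypothetical and do not describe any particular machine cited above.

```python
def bounding_box(grid):
    """Edge framing: first and last rows and columns that contain black."""
    rows = [r for r, row in enumerate(grid) if any(row)]
    cols = [c for c in range(len(grid[0])) if any(row[c] for row in grid)]
    if not rows:
        raise ValueError("pattern contains no black cells")
    return rows[0], rows[-1], cols[0], cols[-1]


def center_of_gravity(grid):
    """Mean (row, column) position of the black cells."""
    cells = [(r, c) for r, row in enumerate(grid)
             for c, v in enumerate(row) if v]
    if not cells:
        raise ValueError("pattern contains no black cells")
    n = len(cells)
    return sum(r for r, _ in cells) / n, sum(c for _, c in cells) / n


def center_by_gravity(grid):
    """Shift the pattern by whole cells so that its center of gravity lies
    as near as possible to the middle of the field."""
    rows, cols = len(grid), len(grid[0])
    g_row, g_col = center_of_gravity(grid)
    dr = round((rows - 1) / 2 - g_row)
    dc = round((cols - 1) / 2 - g_col)
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and 0 <= r + dr < rows and 0 <= c + dc < cols:
                out[r + dr][c + dc] = 1
    return out


# A hypothetical plus sign in the upper left corner of a 7 x 7 field.
PLUS = [[0] * 7 for _ in range(7)]
for r, c in [(0, 1), (1, 0), (1, 1), (1, 2), (2, 1)]:
    PLUS[r][c] = 1

if __name__ == "__main__":
    print(bounding_box(PLUS))
    print(center_of_gravity(center_by_gravity(PLUS)))
```

A smear on one side of a character moves the bounding box by the full extent of the smear, whereas it moves the center of gravity only in proportion to the added black area. That difference is one way of reading the remark above that center-of-gravity positioning is preferable for badly smeared source patterns.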

In  addition  to  the  obvious  improvement  in  the  input  pattern  by  normalizing  its  location  with 
respect  to  a  particular  reference  pattern  framework,    some  of  the  special  transformations  are 


1/ Shepard, D. H., P. F. Bargh, and C. C. Heasley, Jr. "A reliable character sensing system
for documents prepared on conventional business devices," Ref. 419, p. 112.

2/ See Baumann, D. M., et al. "Character recognition and photomemory storage devices
feasibility study," Ref. 43, pp. 3, 7. See also Baumann, Ref. 44.

3/ Evey, for example (Ref. 126), describes a 'lower right hand corner' position-normalization.

4/ See Ref. 278, and p. 130 of this report.

performed to carry out improvements which normalize size. Harmon, who has developed a special device
for Gestalt-type recognition of line drawings of circles, triangles, squares, etc., discusses the use
of a dilating circular scan. 1/ With this scanning method, similar transformations are obtained for
geometrically similar figures, with variations of source pattern size being translated into
time-of-arrival changes in the derivation of the input pattern. Topological relationships are preserved under
rotation.

Optical  filtering  to  produce  Fourier  transformations  is  another  possibility.     Thus,   Kelly  and 
Singer  state: 

"Another  useful  property  filter  would  be  one  which  yielded  Fourier  components  of  the 
spatial  pattern.     The  possibility  exists  of  obtaining  form  information  independent  of  pattern 
size  and  orientation  from  such  a  filter.  "    2/ 

The use of a variety of cleanup, line-thinning or line-thickening, integration over small local
areas, edging, and similar operations, has already been mentioned. Similar techniques have been
investigated in Russian work in the field. Blokh, for example, states:

"As  a  rule,   input  images  are  subjected  ...   to  specially  designed  transformations  which 
emphasize  the  more  distinctive  characteristics,   remove  minor,  unessential  details,   and  even 
replace  given  images  by  others  more  convenient  .  .  .   Such  transformations,   the  more  common 
of  which  are  centering,   contouring,   filtering  and  others  . .  .    convert  the  input  collection  of 
images  into  others  which  will  be  termed  preconditioned  images.  "  3/ 

Various proposed operations to reduce noise by local sub-area averaging, to standardize line
widths, to extract criterial features, and to ascertain the relative location and size of such extracted
features, are exemplified in the system utilizing local operations, described by Bomba. 4/ This has
been tested on handprinted samples of 34 alphanumeric characters, using a scanner developed by
Highleyman and Kamentsky 5/ and simulation on an IBM 704 computer.

In this system, the initial input pattern consists of a 60x90 array. Sub-areas that are subjected
to local operations range from a 3x3 section for local averaging to a 15x15 section, which allows for
feature extraction for up to 7 cells extending from the local sub-area center. The local averaging
operations to reduce noise are applied first. They tend to reduce irregularities along the character
contour and small holes or missing black in the body of character strokes. For each 3x3 cell array,
a center cell that was initially sensed as 'white' will be re-recorded as 'black' if at least 5 of the
surrounding 8 cell positions are black, and a 'black' will be rewritten as 'white' if 5 or more
immediately adjacent neighbors are white. Otherwise no change will be made to the center cell. The
process may be repeated one or more times. Line-width normalization in this Bomba system thins
down a character stroke line that is more than four cell units wide to a uniform average width of four
units. Similarly, it thickens strokes which are narrow.
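
The 3x3 averaging rule just described can be sketched in a few lines of Python. This is only an
illustration of the stated rule, not Bomba's program: the function name smooth_once and the treatment
of cells beyond the edge of the array as white are assumptions made here.

```python
def smooth_once(grid):
    """Apply one pass of the 3x3 majority rule to a grid of 0 (white) / 1 (black)."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # decisions are based on the original grid only
    for r in range(rows):
        for c in range(cols):
            black = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # the center cell does not vote
                    rr, cc = r + dr, c + dc
                    # Assumption: positions outside the array count as white.
                    if 0 <= rr < rows and 0 <= cc < cols:
                        black += grid[rr][cc]
            white = 8 - black
            if grid[r][c] == 0 and black >= 5:
                out[r][c] = 1  # fill a small hole
            elif grid[r][c] == 1 and white >= 5:
                out[r][c] = 0  # remove an isolated or protruding black cell
    return out


ring = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(smooth_once(ring)[2][2])  # the enclosed white cell is re-recorded as black
```

Repeating the pass, as the text allows, amounts to calling smooth_once on its own output.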

For criterial feature extraction by Bomba's local operations, a program called 'Feastract' is
used. The local area operated upon is a radial pattern built up of the appropriate combinations of
smaller local areas. To find, for example, an L-shaped feature, a reference pattern element
consisting  of  a  cell,    P,   the  seven  cells  directly  above  P  in  a  vertical  line,   and  the  seven  cells 
extending  in  the  horizontal  line  to  the  right  of  P,   is  moved  over  the  input  pattern  field.     This  is  done 
in  a  scanning  manner  so  as  to  detect  coincidence  of  black  cells  in  the  input  pattern  local  area  so 
scanned  with  the  cells  of  the  L-feature  extraction  pattern.     When  there  is  coincidence  for  all  the 
designated  cells,   then  an  L-feature  signal  is  recorded  in  a  buffer  image  for  this  feature  at  the  same 
respective  coordinates  which  the  cell  'P'  then  has.     By  dividing  these  secondary  input  patterns  (buffer 
images  for  each  extracted  feature)  into  zones,   the  recognition  logic  may  take  account  of  relative 
location  and  connectivity  of  the  features  that  have  been  detected.     The  input  pattern  has  thus  been 
improved  by  noise  reduction,   line-width  standardization,   and  discarding  of  non-criterial  information. 
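
As a rough sketch of the L-feature search described above, the following Python fragment slides the
reference element (cell P, seven cells above it, seven cells to its right) over a binary array and
marks P in a buffer image wherever every designated cell is black. The function name and the
convention that row 0 is the top of the array are assumptions for illustration; the actual
'Feastract' program is not reproduced in the source.

```python
ARM = 7  # cells in each arm of the L, as stated in the text


def extract_l_feature(grid, arm=ARM):
    """Return a buffer image with 1 at each position P where the L element fits."""
    rows, cols = len(grid), len(grid[0])
    buffer_image = [[0] * cols for _ in range(rows)]
    for r in range(arm, rows):           # P needs `arm` rows above it
        for c in range(cols - arm):      # P needs `arm` columns to its right
            vertical = all(grid[r - k][c] for k in range(arm + 1))       # P and cells above
            horizontal = all(grid[r][c + k] for k in range(1, arm + 1))  # cells to the right
            if vertical and horizontal:
                buffer_image[r][c] = 1
    return buffer_image


# A 10x10 field containing one L whose corner is at row 8, column 1.
field = [[0] * 10 for _ in range(10)]
for row in range(1, 9):
    field[row][1] = 1
for col in range(1, 9):
    field[8][col] = 1

hits = extract_l_feature(field)
print([(r, c) for r in range(10) for c in range(10) if hits[r][c]])
```

Dividing the buffer image into zones, as the text goes on to describe, would then be a matter of
testing which region of the array each recorded position falls in.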


1/ Harmon, L.D. "A line-drawing pattern recognizer." Ref. 194.

2/ Kelly, P.M. and J.R. Singer. "Bio-computer design," Ref. 251, p. 1-7, also Ref. 252, p. 221.

3/ Blokh, E.L. "The question of minimum description." Ref. 53, pp. 17-18.

4/ Bomba, J.S. "Alpha-numeric character recognition using local operations," Ref. 54.

5/ Highleyman, W.H. and L.A. Kamentsky. "A generalized scanner for pattern and character
recognition studies", Ref. 211.
Criteria for local sub-area averaging or spatial transformations for noise reduction other than
those used by Bomba are also available. Kamentsky, for example, considers both
immediate-neighbor and 'directed connection' effects, allowing for greater influence of an immediate
neighbor if this cell itself has an immediate neighbor in a given direction that is of interest, e.g.,
because of expected continuity of a line. As Kamentsky clearly implies, immediate neighbor local
averaging is in effect a case of what he terms a two-valued Z-gate threshold logic, that is, the inputs
to the Z-gates may have either the value 1, 'black', or 0, 'white'. Kamentsky also considers the case
of three-valued Z-gating, described as follows:

"The  general  elements  Nj  may  have  any  number  n    of  inputs  Xij  each  taking  on  the  value  0, 
+1, or -1. All elements are controlled by setting a 'threshold Z'. Z can take on the values 0 to
n. The elements have a single output Yj. Yj is either 0 or 1 based on the criteria:

    Y_j = 1  if  \sum_{i=1}^{n} X_{ij} > Z,

    [remaining criteria illegible in the source]" 1/

We note that this three-valued Z-gating logic might be required in certain possible implementations
of the black, white, and grey (or indifferent) 'meshes' (cells) of the Japanese character
recognition system discussed by Wada. 2/ Systems such as that proposed by Taylor 3/ offer the
further complication that a given cell may have many values (recording the observed intensity of black
in the sub-area of the source pattern which it represents). The question of proper thresholds for
averaging, if used, and the design of Z-gating criteria for such systems would present formidable
difficulties with respect to the many combinations of inputs active (neighbor cells black) and of
weights of those active that would be equivalent for some particular value of Z. That is, under such
conditions, certain different patterns would not be separable with respect to this type of operation.
Figure 22 shows examples of patterns that would be equivalent with respect to the processing of the
center cell of a 3x3 local area, allowing the values of 0, 1, ... 4 to any of the n (=8) inputs, and
setting Z=n. For larger arrays, which are often considered desirable, 4/ the problem would
obviously increase.
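
The equivalence problem can be made concrete with a small numerical sketch. It assumes only the
first criterion quoted from Kamentsky (output 1 when the summed inputs exceed Z); the function name
z_gate and the two sample neighborhoods are invented here for illustration and are not the patterns
of Figure 22.

```python
def z_gate(inputs, z):
    """Return 1 if the summed input values exceed the threshold z, otherwise 0."""
    return 1 if sum(inputs) > z else 0


N = 8   # neighbors of the center cell in a 3x3 local area
Z = N   # threshold setting used in the text's example

# Each neighbor may take a value 0..4 (observed intensity of black).
few_dark = [4, 4, 4, 0, 0, 0, 0, 0]    # three heavily inked neighbors
many_light = [2, 2, 2, 2, 1, 1, 1, 1]  # eight lightly inked neighbors

print(sum(few_dark), sum(many_light))               # both sums are 12
print(z_gate(few_dark, Z), z_gate(many_light, Z))   # both outputs are 1
```

The two neighborhoods differ in every cell, yet the gate sees only their common sum, which is the
sense in which such patterns are not separable by this type of operation.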

6.1.4 Examples of Criterial Feature Extraction

In the system for character recognition using local operations reported by Bomba, 17 different
features are extracted as being useful in identification decisions. These include horizontal and
vertical lines, slanting lines at various angular displacements from the vertical, four orientations of
both T-shape and L-shape line intersections, and selected orientations of V-shape intersections. 5/
Demer, Gaffney, and Rohland 6/ search for criterial features, which they term "signature
components",   such  as  enclosed  white  areas,   long  and  short  positive  strokes,   right  and  left 
"overhangs",   and  vertical  lines  before  or  after  the  encountering  of  these  crossovers.     A  wide  variety 
of  different  features,   and  different  combinations  of  features,  have  been  used  in  different  criterial 
feature  analysis  systems. 

Sometimes not only features but also relationships between the features must be detected,
as in Sutter's patent for recognition of handwritten numeric characters, which specifies conditions
such as the following:

"The  initial  stroke  of  the  symbol  or  numeral  is  inclined  downward  toward  the  right  .... 

"Within  the  second  zone  there  is  a  scanning  line  on  which  a  second  pulse  occurs,   this 
pulse  being  later  in  time  than  the  first  pulse  or,   in  other  words,   there  is  a  stroke  to  the 
right  of  the  stroke  being  followed,    .  .  . 
1/ Kamentsky, L.A. "Pattern and character recognition systems", Ref. 244, p. 306.

2/ Wada, H., et al. "An electronic reading machine", Ref. 521.

3/ Taylor, W.K., Refs. 478, 479, 480, 481, 482.

4/ Thus Sherman remarks: "In principle one would desire a decision function for an N x N
aperture which would define the central element as a function of the entire aperture occupancy."
Ref. 421, p. 234.

5/ Bomba, J.S. "Alpha-numeric character recognition using local operations", Ref. 54.

6/ Demer, F.M., J.F. Gaffney and W.S. Rohland. U.S. Patent 2,963,683, Ref. 89.

Figure 22. Examples of Equivalent Patterns in Z-Threshold Local Averaging

"Within the second zone there is a scanning line on which the stroke being followed is
intersected by a horizontal line . . . " 1/

Similarly,    several  of  the  Farrington-IMR  systems  use  not  merely  stroke  analysis  but  also  the 
detection  of  criterial  combinations  of  strokes,    described  as  follows: 

"The  machine  does  not  recognize  all  strokes  individually,   but  as  a  matter  of  programming 
convenience  requires  instead  that  some  be  recognized  in  the  following  patterns:    long  vertical 
left,   long  vertical  right,   horizontal  top,   horizontal  middle,   horizontal  bottom,    short  vertical 
upper  right  and  short  vertical  lower  left,    short  vertical  upper  left  and    short  vertical  lower 
right,    short  vertical  left  and  right  (simultaneously)  .... 

"Characters  are  identified  as  combinations  of  the  above  strokes  and  stroke  groups.     A 
character  is  identified  and  its  storage  code  signalled  only  if  the  strokes  which  describe  it 
are detected and if strokes which do not describe it are not detected." 2/

Other types of criterial features extracted in various source-to-input pattern transformations
include the covering by the character of a specific subset of 'index points' as in Maul's special
font; 3/ the counting of the number of character stroke crossings encountered, as in a reader
proposed by Pahl for a lower-case Cyrillic alphabet; 4/ the determination of a specific combination
of points at which the relative rate of change of direction of a left- or right-half contour line is
significantly altered (Sprick and Ganzhorn); 5/ and the computation of moments as proposed by Alt. 6/
Rochester, in the 'lakes' and 'inlets' method, claims that one of the novel aspects of the invention is:
"...   The  use  of  mathematical  topology  in  getting  at  the  crux  of  distinctive  features.  .  .   little  attention 
is  paid  to  the  lines  of  the  character  as  such,   but  the  lines  are  only  important  insofar  as  they  bound 
regions.  "  7/  For  identification  of  numerals  only,   Rochester  distinguishes  8  basic  shapes  such  as 
"tall  true  lake,  "  "long  vertical  black  line",   and  "small  left  inlet",   which,   in  proper  combinations, 
discriminate  between  the  characters  of  the  vocabulary. 

Criterial feature techniques that combine various topological properties, distinctive
area-covering subpatterns, and specific stroke-direction clues include those of Grimsdale and Sumner,
Sherman, and Frishkopf and Harmon. Grimsdale, Sumner, et al, at Manchester University, have
used a 40x64 bit matrix on the Mark I computer to determine the number, size, curvature, relative
length, and orientation of constituent parts or segments of character patterns. They have described
the 'grouping' aspect of their feature extraction process as follows:

"The  scanning  programme  operates  by  analyzing  the  patterns  into  'groups'.  Broadly, 
a  group  is  defined  as  a  two-dimensional  collection  of  pattern  points,  the  horizontal  extent 
of which displays no sudden changes on successive lines." 8/

Sherman's investigations have been concerned with the problems of recognition of hand-printed
characters, especially the search for character invariants where the source patterns may vary in
size, slant, registration, rotation and the like. He notes first that the use of holistic templates
would be completely impractical because of the enormous number of possible combinations of pattern
parameters. Sherman has therefore turned to the field of mathematical topology, with particular
reference to graph theory, for his criteria of recognizability. The use of graph theory enables the
encoding of a given pattern in the form of a connection matrix. The rows and columns of the
connection matrix correspond to the nodes of the graph, while its elements might correspond to the
number of connections or line segments between the nodes. However, the connection matrix in this
form would not discriminate between characters having 90° or 180° ambiguity, or otherwise having
topological equivalence (e.g., "S" and "2").
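
The ambiguity can be illustrated with a short sketch. The letters "N" and "Z" (each a 90° rotation
of the other) are chosen here as an example and are not taken from Sherman; the node coordinates,
the dictionary layout, and the function name are likewise assumptions. Each letter is described by
four nodes (two stroke ends and two corners) joined by three line segments.

```python
def connection_matrix(letter):
    """Build the node-by-node matrix counting line segments between nodes."""
    n = len(letter["nodes"])
    matrix = [[0] * n for _ in range(n)]
    for a, b in letter["edges"]:
        matrix[a][b] += 1
        if a != b:
            matrix[b][a] += 1
    return matrix


# Nodes are (x, y) positions listed in the order the stroke is traced.
letter_n = {
    "nodes": [(0, 0), (0, 2), (2, 0), (2, 2)],  # up, diagonal down, up
    "edges": [(0, 1), (1, 2), (2, 3)],
}
letter_z = {
    "nodes": [(0, 2), (2, 2), (0, 0), (2, 0)],  # across, diagonal down, across
    "edges": [(0, 1), (1, 2), (2, 3)],
}

print(connection_matrix(letter_n) == connection_matrix(letter_z))  # identical matrices
print(letter_n["nodes"] == letter_z["nodes"])                      # different geometry
```

Only the node positions, which the matrix ignores, tell the two letters apart.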

Therefore, Sherman also uses the relative geometric positions of the nodes. A 3x3 local
sub-area is used for input pattern improvement operations, such as thinning of a line segment, and
also for the derivation of distinctive features such as angle-of-connection identification, including
determination of initial motion in exploring out from a given node. The lines connecting nodes are
encoded in the form of a sequence of numbers, each of which represents a quantized local derivative.
"The sorting process which will follow requires that each of these sequences be characterized as a
concatenation of straight line segments, or of straight lines and arcs, or of arcs." 1/ Topological
sorting is followed by geometric sorting, where two line segments are considered to be in the same
class if they agree in number of line segments, the allowable range of angular measurement for each
segment, and relative location of nodes. The proposed system has been programmed for the
Whirlwind I computer.
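
One way to picture the encoding of a line as a sequence of quantized local derivatives is an
eight-direction step code, sketched below. The particular numbering (0 for a step to the right,
increasing counter-clockwise, with y increasing upward) and the helper names are assumptions for
illustration; the source does not give Sherman's actual quantization.

```python
# Assumed numbering of the eight unit steps (dx, dy) between adjacent cells.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def direction_sequence(points):
    """Encode a traced line (list of adjacent cell positions) as direction numbers."""
    return [
        DIRECTIONS[(x1 - x0, y1 - y0)]
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]


def is_straight(sequence):
    """A run with a single repeated direction number is a straight segment."""
    return len(set(sequence)) <= 1


trace = [(0, 0), (1, 0), (2, 0), (3, 1)]
print(direction_sequence(trace))               # [0, 0, 1]
print(is_straight(direction_sequence(trace)))  # False: the last step turns
```

A sequence whose direction numbers drift steadily would suggest an arc, while a constant sequence
suggests a straight segment, which is the kind of characterization the quoted sorting step calls for.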

A similar approach utilizing graph theory and topological differentiation has been proposed by
Garmash, Pereverzev, and Tsirlin, in Russia. They also note, as Sherman does, that topological
differentiation may be insufficient for final identification but that the use of subsequent
non-topological criteria after preliminary sorting can be simplified. They attack the rotational
ambiguity problem by specifying a definite point of origin, such as the lower left hand corner, for
line segment tracing, and they propose use of a curve-tracking scanner. 2/

The problems of variability and the search for relatively invariant features in individual
hand-printed or handwritten characters are still further aggravated when we consider the possibilities
of automatic machine recognition of cursive handwriting. Research projects at the Bell Laboratories
have been directed toward possible solutions for some of these problems. At the Fourth London
Symposium on Information Theory (August 1960), Frishkopf and Harmon reported on automatic
recognition of lower case cursive script, where the writer uses a captive stylus so that continuous X
and Y signals are generated with the movements of his pen. 3/ The image field involves an 11x11
quantization. The writer is constrained to observe a baseline and to attempt to fit
those letters without ascenders or descenders into the space between the baseline and a parallel
guideline above.

The  first  of  the  criterial  features  to  be  detected  in  this  system  is  that  of  relative  vertical 
extent,    so  that  a  rough  first  sorting  provides  groupings  of  characters  with  ascenders,    characters 
with  descenders,    characters  with  both,    and  all  others.     A  second  discriminating  characteristic  is  the 
presence  or  absence  of  retrograde  strokes.     Abrupt  changes  in  slope,   or  'cusps',   are  also  detected 
together  with  determination  of  location  of  occurrence  with  respect  to  the  zones  of  vertical  extent. 
Closure,   or  the  presence  or  absence  of  loops  or  near-loops,    provides  a  fourth  criterion.     Finally, 
special  marks  such  as  the  dotting  of  the  "i"  and  the  cross-bar  of  the  "t"  are  used. 

Appropriate  combinations  of  these  criterial  features  can  be  used  for  letter-by-letter  recognition 
of  handwritten  words  where  the  word  can  be  segmented  so  as  to  locate  its  letter  constituents  with 
reasonable accuracy. Frishkopf and Harmon also consider possibilities of recognizing the
handwritten word as a whole, again emphasizing that the highly variable and non-essential details of a
particular  source  pattern  should  be  eliminated  as  far  as  possible  and  that  the  significant  features 
should  be  isolated  and  preserved. 

6.2 Pattern Comparison Processing

After  scanning  and  transformation  of  the  input  pattern  have  been  carried  out,    whether  by 
holistic,   analytic,   or  criterial  techniques,    the  input  pattern  and  its  elements  must  be  matched  with 
the  reference  patterns  of  the  vocabulary.     The  processes  used  in  various  systems  for  comparison  of 
input  patterns  with  reference  patterns  generally  fall  into  three  distinct  categories.     These  are: 
template  matching,   decision-tree  processing,    and  parallel  (or  'Pandemonium')  processing. 

The  simplest  case  is,   as  we  have  seen,   that  of  template  matching.     It  requires  that  the  input 
pattern  as  a  whole  or  in  each  of  all  its  parts  be  matched  with  some  one  reference  pattern  as  a  whole. 
Template  matching  techniques  may  be  applied  to  holistic  patterns  that  are  photographic  negative 
images  of  all  anticipated  character  patterns,   or  that  are  complete  coordinate  descriptions  of  the 
quantized  pattern,   or  that  are  exact  descriptions  of  contour-tracings.     Examples  of  different  design 


1/ Sherman, H. "A quasi-topological method for machine recognition of line patterns", Ref. 421,
p. 235.

2/ Garmash, V.A., V.S. Pereverzev, and V.M. Tsirlin. "Quasitopological method of letter
recognition", Ref. 155.

3/ Frishkopf, L.S., and L.D. Harmon. "Machine reading of handwriting", Ref. 151.


characteristics  used  in  template  matching  comparison  techniques  include  those  of  Handel,   Ayres, 
and  McNaney. 

Handel's patent, assigned to the General Electric Company, was filed in 1931 and awarded in
1933. 1/ It provides that the reference pattern vocabulary be recorded as stencil images on a
rotatable disk. The carrier item is illuminated by a suitable light source and a lens system projects
the image of the source pattern for comparison matching against the disk. The stencils are exact
duplicates, in shape and proportion, of the expected source patterns. As the disk is rotated, there is
one and only one position in each rotation cycle "in which the image formed thereon exactly coincides
with one of the stenciled numbers". At such time, a minimum value of light is transmitted through the
disk to a photo-electric device. The photo-electric cell is connected with an amplifier so constructed
that it will deliver a tripping voltage to means for controlling a selector switch only when the light
received reaches a predetermined minimum value.
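
The minimum-light principle can be mimicked numerically. In the sketch below, which is an
idealization and not a description of Handel's apparatus, the image and each stencil are small
binary arrays, light is taken to pass only where a stencil opening meets a bright (non-inked) part
of the image, and the 'tripping' condition is modelled as the transmitted count falling to a preset
level. All names and the 3x3 patterns are assumptions for illustration.

```python
def transmitted_light(image, stencil):
    """Count cells where the stencil is open (1) and the projected image is bright (0)."""
    return sum(
        1
        for image_row, stencil_row in zip(image, stencil)
        for inked, opening in zip(image_row, stencil_row)
        if opening == 1 and inked == 0
    )


def recognize(image, stencils, trip_level=0):
    """Return the label of the stencil passing the least light, if it reaches trip_level."""
    label = min(stencils, key=lambda name: transmitted_light(image, stencils[name]))
    if transmitted_light(image, stencils[label]) <= trip_level:
        return label
    return None  # no stencil position trips the selector


VERTICAL_BAR = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
HORIZONTAL_BAR = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
STENCILS = {"I": VERTICAL_BAR, "-": HORIZONTAL_BAR}

print(recognize(VERTICAL_BAR, STENCILS))  # I
```

In this idealization an image that is inked wherever a stencil is open would also block all light,
even if the image is inked elsewhere as well, so the sketch says nothing about how closely the
stencil vocabulary must be matched to the expected source patterns.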

In the Ayres patent, 2/ awarded in 1938 and assigned to IBM, a sequentially ordered processing
of input pattern elements with reference pattern elements, by simultaneous scanning of both, is
disclosed. It is assumed that both the source pattern and the reference patterns are illuminated and
that the reflected light is transmitted to separate photocells. Scanning is provided by a disk having a
number of apertures smaller than the character to be scanned, arranged so that each fractional area
of the character is scanned successively. The system thus subdivides the patterns into a series of
minute areas, and it is claimed that the scanning "is greatly refined in its exactness, as compared
with a method of scanning the whole of the indicia". However, the actual comparison-matching,
which is based upon a net difference balancing of the instantaneous current values of the two
photocells, is effected only when all of the successively scanned sub-areas of the input and reference
characters are similar.

In the template matching technique proposed by McNaney, a 'shaped beam tube', also patented
by McNaney, is used. 3/ This patent was awarded in 1958, and has been assigned to the General
Dynamics Corporation. The inventor describes the system as follows:

"The  invention  employs  illumination  or  shadowing  of  printed,   punched  or  typed  images 
from  paper,   or  the  like,   onto  a  light   responsive  screen  of  a  cathode  ray  tube.     The  cathode 
ray  tube  so  employed,    is  capable  of  displaying  a  plurality  of  like  predetermined  images 
electronically  upon  its  target  or  screen.     The  screen  or  target  of  the  tube  is  provided  with 
a  conductive  layer  disposed  directly  upon  the  tube,   and  has  a  light  responsive  layer,    such  as 
a  photoconductive  material,   overlaid  upon  the  conductive  layer.     Therefore,   as  an  image  is 
optically  projected  through  the  conducting  layer  onto  the  light  responsive  layer,   the  light 
responsive  layer  will  become  conducting  in  the  areas  illuminated  by  the  image.     When, 
therefore,   the  like  image  is  projected  by  the  electron  beam  of  the  cathode  ray  tube,   there 
is  established  a  condition  of  coincidence  between  the  images  and  also  of  equilibrium  in  an 
external  circuit,   there  being  no  current  flow  at  that  moment  from  the  electron  beam  through 
the  conducting  layer  to  a  means  external  of  the  tube  which  is  capable  of,    circuitwise, 
establishing  coincidence  between  the  coded  information  furnished  to  the  tube  to  project  the 
character  and  at  the  same  time  present  that  coded  information  to  an  output.     Therefore, 
only  upon  coincidence  of  the  images  at  the  target,    will  an  output  be  generated  which  is  the 
same output as that generated by a code generator for presentation of the image by the
cathode  ray  tube.     This  system  then  provides  a  very  simple  and  trouble-free  image-to-code 
conversion  system.  " 

A technique somewhat similar to that of McNaney, also using an area-preserving, shape-dependent,
position-dependent template matching principle, is discussed in British Patent 83,326. This
invention involves projection of the input pattern onto an insulating plate with an arrangement of
light-sensitive resistors, each connected to a specified bridge, such that different combinations of
resistors are affected for each different character of the vocabulary. 4/

The template matching techniques for pattern comparison processing typically permit recognition
of only the predetermined characters of a closed vocabulary. The pattern comparison
processing operations of either decision-tree or parallel processing types, on the other hand, are
typically applied to analytic input patterns or to input patterns comprised of selected criterial
features. These comparison techniques may therefore accommodate, to a greater or lesser degree,
character pattern variants that are similar to, but not necessarily identical with, the characters of
the reference pattern vocabulary.

The  distinction  between  decision-tree,   or  sequential,   processing  of  the  input-to-reference 
pattern  matching  or  subsequent  identification  formula  matching,   and  parallel  processing,   has  been 
stressed  by  many  workers  in  the  fields  of  automatic  character  recognition  and  pattern  recognition 
research. For example, both Minsky and Neisser emphasized differences between these two
approaches in a panel discussion of pattern recognition problems at the Eastern Joint Computer
Conference, December 1959. 1/ Selfridge and Neisser have provided a simplified description of these
differences as follows:

"There  are  two  fundamentally  different  possibilities:    sequential  and  parallel  processing. 
In  sequential  processing  the  features  are  inspected  in  a  predetermined  order,   the  outcome  of 
each  test  determining  the  next  step.     Each  letter  is  represented  by  a  unique  sequence  of  binary 
decisions.     To  take  a  simple  example,   a  program  to  distinguish  the  letters  A,   H,   V  and  Y 
might  decide  among  them  on  the  basis  of  the  presence  or  absence  of  three  features:    A  concavity 
at  the  top,   a  crossbar  and  a  vertical  line.     The  sequential  process  would  ask  first:     'Is  there  a 
concavity  at  the  top?'  If  the  answer  is  no,   the  sample  is  A.     If  the  answer  is  yes,    the  program 
asks,    'Is  there  a  crossbar?'    If  yes,   the  letter  is  H;    if  no,   then:     'Is  there  a  vertical  line?' If 
yes,   the  letter  is  Y;  if  no,   V  .  .  .  . 

"In  parallel  processing  all  the  questions  would  be  asked  at  once,   and  all  the  answers 
presented  simultaneously  to  the  decision-maker  .  .  .    Different  combinations  identify  the 
different  letters.     One  might  think  of  the  various  features  as  being  inspected  by  little  demons, 
all  of  whom  then  shout  the  answers  in  concert  to  a  decision-making  demon.     From  this  conceit 
comes the name 'Pandemonium' for parallel processing." 2/
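
The A, H, V, Y example in the quotation can be written out directly. The sequential function below
follows the order of questions given by Selfridge and Neisser; the parallel function and the full
feature profiles (for instance, that "A" has a crossbar but no vertical line) are assumptions added
here so that all three answers can be compared at once.

```python
def sequential(concavity, crossbar, vertical):
    """Decision tree: each answer determines which question is asked next."""
    if not concavity:
        return "A"
    if crossbar:
        return "H"
    return "Y" if vertical else "V"


# Assumed expected answers (concavity at top, crossbar, vertical line) per letter.
PROFILES = {
    "A": (False, True, False),
    "H": (True, True, True),
    "Y": (True, False, True),
    "V": (True, False, False),
}


def parallel(concavity, crossbar, vertical):
    """All three answers are compared with every profile; best agreement wins."""
    observed = (concavity, crossbar, vertical)

    def agreement(letter):
        return sum(o == e for o, e in zip(observed, PROFILES[letter]))

    return max(PROFILES, key=agreement)


for letter, features in PROFILES.items():
    print(letter, sequential(*features), parallel(*features))
```

On clean inputs the two procedures agree. They differ in what happens when one answer is wrong: the
tree is committed by the first faulty answer, whereas the parallel comparison still weighs the other
two, although with only three features there is little redundancy to draw on.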

Examples of systems using a decision-tree procedure for pattern comparison and identification
formula matching include various Farrington-IMR machines and the systems of Glauberman, Fitch,
Greanias et al (in the proportional parts method), and Bomba, among others. Quite commonly,
several parallel paths through the decision-tree structure may be provided for the same character.
Thus, Rochester notes that a '4' with serifs, a '4' without serifs, and a pica '4' will be processed
through a logic-tree structure for the presence or absence of specified criterial features as though
each were a separate and distinct character. However, the final result will be a single target pattern
output representing '4'. 3/ The use of parallel paths, as for example in the proportional parts method
of Greanias et al, may in some cases provide a means of minimizing one of the disadvantages often
associated with sequential processing, namely that the technique usually requires each of the series
of tests to be satisfied in a fixed order and that, therefore, the final decision rests upon the outcome
of the worst or weakest test.

Bomba observes that with the decision-tree structure of his criterial feature analysis
recognition logic, multiple paths do result from various branching possibilities. He concludes,
therefore, that some character pattern redundancy must be retained in the system. 4/ On the other hand,
related pattern recognition research efforts, such as those of Gill 5/ or Glovazky, 6/ are concerned
not only with minimization of redundancy in the derivation of the input pattern from the unknown source
pattern character, but also with eliminating redundancy in the comparison processing, by determining
minimal paths through decision-tree structures. These efforts may be concerned with techniques
such as those of criterial area, or 'peephole', templates, or with other relatively invariant properties,
such that, so far as possible, each property test should successively bifurcate or dichotomize the
portion of the reference pattern vocabulary remaining to be searched.
Among other pattern recognition research efforts related to the use of decision-tree comparison
matching processes are those of Unger and Blokh. Blokh discusses requirements for determining an
optimum sequence for testing input pattern elements as a matter of maximizing the accretion of
information. He also considers the possibility of utilizing independent information with respect to the
probabilities of occurrence of various different source patterns, based upon the 'greatest conditional
information' given that the previous decision-step element was black or was white. 1/
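
A single step of such an information-maximizing choice might look as follows. The sketch uses the
Shannon entropy of the black/white split as its measure of information, which is an assumption made
here rather than Blokh's stated formula; the three-cell patterns, the probabilities, and the function
names are likewise invented for illustration.

```python
import math


def information(cell, patterns, probs):
    """Entropy (bits) of the black/white answer at `cell` over the remaining patterns."""
    total = sum(probs[name] for name in patterns)
    p_black = sum(probs[name] for name, cells in patterns.items() if cells[cell]) / total
    if p_black <= 0.0 or p_black >= 1.0:
        return 0.0  # a cell that is always black or always white tells nothing
    return -(p_black * math.log2(p_black) + (1 - p_black) * math.log2(1 - p_black))


def best_cell(patterns, probs):
    """Pick the cell whose test is expected to yield the most information."""
    n_cells = len(next(iter(patterns.values())))
    return max(range(n_cells), key=lambda cell: information(cell, patterns, probs))


PATTERNS = {"a": (1, 1, 0), "b": (1, 0, 0), "c": (1, 1, 0), "d": (0, 0, 0)}
EQUAL = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
SKEWED = {"a": 0.1, "b": 0.4, "c": 0.1, "d": 0.4}

print(best_cell(PATTERNS, EQUAL))   # cell 1 splits the vocabulary evenly
print(best_cell(PATTERNS, SKEWED))  # cell 0 becomes the better first test
```

With equal probabilities the best first test is the one that halves the vocabulary; once the
probabilities of occurrence are unequal, a different cell can become the more informative one, which
is the effect of the independent information mentioned in the text.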

Unger  notes  that: 

"It  is  not  necessary  to  ask  all  of  the  questions  in  every  case.     A  sequential  process  can 
be  carried  out  in  which  the  answer  to  the  first  question  determines  a  subset  of  the  alphabet 
to  which  the  input  feature  may  belong.     Depending  upon  the  subset,   a  second  question  is  asked, 
the   answer  to  which  further  narrows  the  field  of  possible  categories.  "    2/ 

Unfortunately,   however,    'ideal'  areas  or  other  properties  that  would  enable  a  minimization  of 
decision-branch  processing  and  a  redundancy-elimination  approximating  the  information-theoretic 
limits  are  not  generally  found  in  real-life  character  vocabularies. 

In  contrast  to  decision-tree  logic  as  applied  to  matching  and  comparison  operations,   the  parallel 
processing  techniques  make  all  tests,    but  without  requiring,   in  the  identification  formulas,   particular 
sequences  of  hits.      Moreover,   in  most  of  the  proposals  suggesting  this  technique,   not  all  tests  need 
be  used  for  a  given  identification.     Instead,   a  majority-vote  type  of  decision  wins.     Perhaps  the  most 
detailed  example  of  the  parallel  processing  techniques  is  that  of  Doyle  with  respect  to  the  28  criterial 
features used for the computer simulation of recognition of handprinted alphabetic characters. Doyle
stresses the distinction between sequential decision-tree processing and the parallel approach as
follows:

"In  any  recognition  scheme  the  raw  data  are  transformed  or  tested  and  from  the  results 
decisions  are  made.     Usually  the  process  consists  of  a  sequence  of  tests  together  with  rules 
for  branching  after  each  test,   the  final  branch  furnishing  the  decision.     Any  recognition 
method  employing  such  a  sequence  of  decisions  to  reach  an  ultimate  verdict  is  evidently 
limited  by  the  weakness  of  its  worst  test  .  .  .   Unless  the  tests  are  good  indeed  or  the  samples 
to  be  recognized  quite  free  of  noise  and  distortion  a  parallel  processing  technique  seems 
preferable." 3/

We  should  note  that  some  confusion  may  arise  on  this  point  due  to  the  use  of  the  'processing' 
terminology  in  contrasting  sequential  with  parallel  methods.     The  contrast  is  more  properly  between 
the  decision-tree  matching  procedure  in  which  each  next  test  is  dependent  upon  the  results  of  a  prior 
test or pattern-element-matching result, and a procedure in which all tests are made for each input
pattern  and  decision  as  to  identification  is  based  on  all  outcomes.     The  latter  procedure  may  in  fact 
be  carried  out  in  a  step-by-step  manner,    sequentially. 

Unless  a  given  sequential  ordering  of  test  results  is  required  in  a  particular  identification- 
formula  system,    sequential  in  contradistinction  to  parallel  processing  involves  rather  a  matter  of 
time-to-recognize  or  of  cost  than  a  matter  of  pattern  recognition  principle.     Moreover,   as  we  have 
noted, a multiple-path decision-tree logic does not necessarily suffer the disadvantage of reliance
upon  the  weakest  test,   even  though  the  tests  which  determine  the  decision-criteria  are  carried  out 
sequentially.     Conversely,   a  parallel-processing  system  which  requires  an  all-or-nothing  basis  for 
decision-making  does  suffer  precisely  this  disadvantage.     Usually,  however,   a  decision-tree 
procedure  is  carried  out  with  sequential  processing  during  the  matching  operations,    whereas  the 
'Pandemonium'  principle  may  be  implemented  by  either  sequential  or  parallel  processing  in  the 
actual  matching  and  testing. 
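
The contrast drawn above can be made concrete with a small sketch. It assumes five binary tests and a four-character vocabulary; the feature names and reference values are hypothetical illustrations, not Doyle's 28 criterial features. The tree consults only the tests along one branch, whereas the all-tests procedure scores every reference character and lets the best agreement win, even though the scoring itself is carried out step by step.

```python
FEATURES = ("closed_loop", "vertical_stroke", "left_curve", "right_curve", "bottom_bar")

# Hypothetical reference outcomes of the five tests for each character.
REFERENCE = {
    "O": (1, 0, 1, 1, 0),
    "D": (1, 1, 0, 1, 0),
    "C": (0, 0, 1, 0, 0),
    "L": (0, 1, 0, 0, 1),
}

def tree_classify(observed):
    """Decision tree: the answer to each test selects the next test."""
    answers = dict(zip(FEATURES, observed))
    if answers["closed_loop"]:
        return "D" if answers["vertical_stroke"] else "O"
    return "L" if answers["vertical_stroke"] else "C"

def vote_classify(observed):
    """All tests are made; the reference agreeing on the most tests wins."""
    def agreement(name):
        return sum(o == r for o, r in zip(observed, REFERENCE[name]))
    return max(REFERENCE, key=agreement)

clean_d = (1, 1, 0, 1, 0)
noisy_d = (0, 1, 0, 1, 0)  # a 'D' whose loop test failed, e.g. a gap in the stroke

print(tree_classify(clean_d), vote_classify(clean_d))
print(tree_classify(noisy_d), vote_classify(noisy_d))
```

With the single failed test, the tree is led down the wrong branch and reports 'L', while the vote over all five tests still favors 'D'. A tree with multiple paths per character, as noted above, could recover some of this robustness.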

6.3  Bases for Recognition-Identification Decisions

Just  as  is  the  case  for  scanning  and  input  pattern  processing  and  for  pattern  comparison 
processing,   a  three-way  differentiation  can  also  be  made  with  respect  to  the  basis  for  recognition- 
identification  decisions  used  in  various  operating  or  proposed  character  recognition  systems.     These 
commonly  used  bases  are:     (1)    all-or-nothing,    such  as  extinction  of  light  to  a  photocell;  (2)  the  'best- 
fit'  or  closest  match  which  allows  some  variability  with  respect  to  the  exactness  of  match  between  the 
input pattern and the reference patterns or between the observed identification formula and the master
formula; and (3) the synthetic, where the probability that a given input pattern is in fact a given
character of the vocabulary may be determined by various factors.

1/ Blokh, E. L. "The question of the minimum description," Ref. 53.

2/ Unger, S. H. "Pattern detection and recognition", Ref. 501, p. 1746.

3/ Doyle, W. "Recognition of sloppy, hand-printed characters", Ref. 104, p. 103.

The  all-or-nothing  basis  for  identification  is  found  not  only  in  the  early  optical  template  systems, 
but  also  in  certain  analytic  or  criterial  input  processing  techniques  where  the  identification  is  effected 
by  truth-table  or  decoding  matrices  in  which  one  and  only  one  output  will  be  energized.     The  obvious 
disadvantages  of  the  exact  match  requirements  for  early  photographic  template  techniques  due  to 
light  leakage  from  even  minor  misregistration  or  from  imperfections  in  blackness  of  character  strokes 
led  rapidly  to  the  development  of  techniques  for  integration  over  small  sub-areas  and  to  'best-fit' 
determination  techniques.     Similar  phenomena  with  respect  to  varying  densities  of  ink  deposited  as 
well  as  character  imperfections  arise  even  in  the  magnetic  ink  recognition  systems. 

A  variety  of  means  have  therefore  been  invented  for  better  correlation  of  'best-fit'  decisions  in 
the  case  of  the  characteristic  waveform  recognition  techniques.     For  example,   improvements  in  the 
ERMA-type magnetic ink recognition systems have been developed by General Electric. 1/ Similar
improvements have also been developed for waveforms derived by optical scanning devices, as in a
Glauberman patent assigned to the Laboratory for Electronics. 2/ Dickinson considers a system using
matched  filters,   in  which,   in  effect,  vectors  are  formed  from  the  sampled  input  waveform  and  from 
the  reference  pattern  waveform  so  as  to  obtain  the  scalar  product  of  the  two.     Thus,    Dickinson  claims: 

"It  is  clear  that,    if  the  scalar  products  of  the  unknown  vector  with  a  set  of  unit-length 
compare  vectors  are  obtained,   the  compare  vector  with  the  smallest  angular  separation  from 
the  unknown  vector  will  yield  the  largest  scalar  product.     If  the  unknown  vector  increases  in 
length,   all  scalar  products  increase  proportionately.     Thus,   by  taking  the  ratio  between  any 
two  scalar  products,   a  means  of  eliminating  the  effect  of  amplitude  variations  of  the   unknown 
vector is obtained." 3/
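
Dickinson's argument can be checked numerically. The following is a minimal sketch, assuming six-sample waveforms; the two compare vectors and the unknown are invented for illustration and the function names are hypothetical.

```python
from math import sqrt

def unit(vector):
    """Scale a vector to unit length."""
    length = sqrt(sum(x * x for x in vector))
    return [x / length for x in vector]

def scalar_products(unknown, compare_vectors):
    """Scalar product of the unknown with each unit-length compare vector."""
    return {name: sum(u * c for u, c in zip(unknown, unit(ref)))
            for name, ref in compare_vectors.items()}

def classify(unknown, compare_vectors):
    """The largest scalar product marks the smallest angular separation."""
    products = scalar_products(unknown, compare_vectors)
    return max(products, key=products.get)

# Hypothetical sampled reference waveforms for two characters.
COMPARE = {"0": [0, 1, 2, 1, 0, -1], "1": [2, 1, 0, -1, 0, 1]}

unknown = [0.1, 0.9, 2.1, 1.2, -0.1, -0.8]
louder = [3 * x for x in unknown]  # same shape, three times the amplitude

weak = scalar_products(unknown, COMPARE)
strong = scalar_products(louder, COMPARE)
print(classify(unknown, COMPARE), classify(louder, COMPARE))
print(weak["0"] / weak["1"], strong["0"] / strong["1"])
```

Tripling the amplitude triples every scalar product, so the ratio between any two of them, and hence the decision, is unchanged.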

Minot,   in  his   1959  survey  of  automatic  character  recognition  developments,   considers  three 
types of the 'best-fit' basis for recognition-identification decision. 4/ These are: (1) partial
matching  by  location,   in  which  more  than  a  required  percentage  of  the  observed  states  of  a  quantized 
input  pattern  are  found  to  be  equal  to  the  prescribed  states  for  the  locations  considered;     (2)  tolerance 
in  gray  match,   where  all  the  observed  states  are  either  equal  or  close  to  the  values  of  all  the 
prescribed  states;   and  (3)  matching  at  critical  locations,   to  which  we  would  add  the  alternative  of 
matching criterial features. This last type of best-fit criterion in the recognition-identification
decision process obviously includes the case where it is effected by preliminary determination of indifferent
locations as in the techniques described by Evey 5/ and Wada. 6/
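
The three types of best-fit criterion listed by Minot can be stated as three small predicates. This is a sketch under the assumption that an input pattern is a flat list of quantized states; the function names, sample values, and thresholds are hypothetical.

```python
def partial_match(observed, prescribed, required_fraction):
    """(1) More than a required fraction of locations hold the prescribed state."""
    agree = sum(o == p for o, p in zip(observed, prescribed))
    return agree / len(prescribed) > required_fraction

def gray_match(observed, prescribed, tolerance):
    """(2) Every observed gray value is equal or close to the prescribed value."""
    return all(abs(o - p) <= tolerance for o, p in zip(observed, prescribed))

def critical_match(observed, prescribed, critical_locations):
    """(3) Only critical locations are compared; all others are indifferent."""
    return all(observed[i] == prescribed[i] for i in critical_locations)

observed = [1, 1, 0, 0, 1, 0, 1, 1]
prescribed = [1, 1, 0, 0, 1, 0, 0, 1]   # differs from observed at location 6 only

print(partial_match(observed, prescribed, 0.8))            # 7 of 8 locations agree
print(gray_match([0.9, 0.1, 0.5], [1.0, 0.0, 0.6], 0.15))
print(critical_match(observed, prescribed, [0, 1, 4]))     # location 6 is indifferent
```

The same input can thus pass under one criterion and fail under another: the pattern above is accepted when location 6 is declared indifferent, but rejected if location 6 is treated as critical.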

Best-fit  decision  rules  are  also  important  in  systems  where  the  recognition-identification  is 
based  upon  significant  properties  or  features  that  are  relatively  invariant  for  different  presentations 
of  character  configurations  that  are  to  be  considered  the  'same'  character,   in  other  words,   a 
character  class.     In  this  sense,   as  Singer  has  noted,   the  'best  fit'  can  become  ".  .  .   the  worst  fit  for 
a set of images which does not overlap with a different set of images." 7/ Farley discusses this
problem with respect both to human and machine recognition of patterns, and notes certain implications
for future research, as follows:

"More  should  be  said  about  the  nature  of  the  rules  which  determine  the  'best  fitting' 
class  when  comparison  of  classes  is  being  made  with  incoming  data.     The  exact  nature  of 
good  rules  of  this  type  remain  to  be  investigated,   but  it  seems  clear  that  the  correlation 
should  involve  a  threshold  which,    if  exceeded,    would  indicate  the  choice  of  the  appropriate 
class.     A  very  simple  rule  would  be  majority  vote  of  properties  common  to  class  and  input, 
perhaps  with  some  properties  weighted  more  heavily  than  others.     Note  that  correlation  rules 
and  thresholds  need  not  be  constant,   and  those  property-classes  under  active  consideration 
may  also  change  with  external  conditions.     Such  changes  could  account  for  the  psychological 
phenomena  of  'set'  or  attention,   and  motivation.      'Context'  can  exert  its  influence  in  the  same 
way--correlation thresholds for example, can change with time or space 'surroundings'." 8/
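
The "very simple rule" Farley describes, a weighted vote of shared properties compared against a threshold that may vary with context, might be sketched as follows. The property names, weights, and threshold values are hypothetical; Farley leaves the exact nature of good rules open.

```python
def weighted_vote(input_props, class_props, weights):
    """Sum the weights of properties common to class and input (default weight 1)."""
    return sum(weights.get(p, 1.0) for p in input_props & class_props)

def choose_class(input_props, classes, weights, threshold):
    """Return the best-scoring class if its score exceeds the threshold, else None."""
    scores = {name: weighted_vote(input_props, props, weights)
              for name, props in classes.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None

# Hypothetical property-classes for two characters.
CLASSES = {
    "A": {"apex", "crossbar", "two_legs"},
    "H": {"crossbar", "two_legs", "two_verticals"},
}
WEIGHTS = {"apex": 2.0}          # one property weighted more heavily than the others
observed = {"apex", "crossbar"}  # an incomplete 'A'

# The threshold need not be constant: here it depends on a hypothetical context.
THRESHOLDS = {"no_context": 3.5, "context_expects_A": 2.5}
for context, threshold in THRESHOLDS.items():
    print(context, choose_class(observed, CLASSES, WEIGHTS, threshold))
```

The incomplete 'A' scores 3.0. It is rejected under the stricter threshold but accepted when context lowers the threshold, which is the kind of 'set' or attention effect the quotation describes.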


1/ See, for example, Eldredge, K. R., and M. D. Marsh. U.S. Patent 2,961,649, Ref. 110; Merritt,
   P. E., et al. U.S. Patent 2,924,812, Ref. 304; Elbinger, L. P. U.S. Patent 2,927,303, Ref. 108.

2/ Glauberman, M. H., et al. U.S. Patent 2,947,971, Ref. 164.

3/ Dickinson, W. E. "A character recognition study", Ref. 95, pp. 337-338.

4/ Minot, O. N. "Automatic devices for recognition ..." Ref. 310.

5/ Evey, R. J. "Character recognition logic design", Ref. 126.

6/ Wada, H., et al. "An electronic reading machine", Ref. 521.

7/ Singer, J. R. "A self-organizing recognition system", Ref. 430, p. 545.

8/ Farley, B. G. "Self-organizing models for learned perception", Ref. 131, p. 19.
In the above quotation, both the best-fit and the synthetic bases for recognition-identification
decision are implied. That is, what we have termed a synthetic basis is one which constructs a
probabilistic decision criterion, frequently involving considerations such as context which are external
to the input pattern per se. For example, the threshold for final identification in one of the Bledsoe-
Browning experiments varies with the word context in such a way that the output or target pattern
character selected may actually have had a lower absolute 'score' from the matching process than
another of the reference characters in the vocabulary. 1/

Another  factor  that  may  be  considered  in  a  synthetic  identification  decision  is  that  of  expected 
character  frequency.     That  is,   where  the  results  of  input-pattern-to-reference-pattern  matching  are 
ambiguous,   final  choice  may  be  made  on  the  basis  of  the  character  that  occurs  more  frequently,   for  a 
given  language  and  subject  matter  source.     An  example  of  considerations  of  this  type  is  reported  by 
Blokh. 2/ Chow considers the case where previously determined noise statistics, specifically
including the ways in which various characters and noise are frequently combined, are used to determine
the conditional probability density with respect to a corrupt and noisy input pattern. 3/ Moreover,
by  associating  the  derived  conditional  probabilities  with  estimates  of  loss  or  risk  involved  in  the 
substitution  of  a  specific  character  for  another,    Chow  suggests  a  means  to  optimize  the  recognition 
system  performance  by  minimizing  the  risk-cost. 
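
The two synthetic factors just described, expected character frequency and the cost of a particular substitution, can be combined in a short sketch. It assumes the matching process supplies a likelihood for each candidate character; the frequencies and loss values are invented for illustration and are not Chow's figures.

```python
def posteriors(likelihoods, priors):
    """Combine matching likelihoods with expected character frequencies."""
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    total = sum(joint.values())
    return {c: j / total for c, j in joint.items()}

def min_risk_decision(post, loss):
    """loss[decided][true] is the cost of reporting `decided` when the input
    is really `true`; choose the report with the smallest expected cost."""
    risk = {d: sum(loss[d][t] * post[t] for t in post) for d in loss}
    return min(risk, key=risk.get), risk

# The matching result is ambiguous: both characters fit equally well.
LIKELIHOODS = {"e": 0.2, "c": 0.2}
PRIORS = {"e": 0.13, "c": 0.03}  # hypothetical relative frequencies

post = posteriors(LIKELIHOODS, PRIORS)

# Equal costs: the more frequent character is chosen.
LOSS_EQUAL = {"e": {"e": 0, "c": 1}, "c": {"e": 1, "c": 0}}
# Hypothetical application in which wrongly reporting 'e' is ten times as costly.
LOSS_SKEWED = {"e": {"e": 0, "c": 10}, "c": {"e": 1, "c": 0}}

print(min_risk_decision(post, LOSS_EQUAL)[0])
print(min_risk_decision(post, LOSS_SKEWED)[0])
```

With equal costs the ambiguity is resolved in favor of the more frequent 'e'. When the substitution of 'e' for 'c' is made expensive, the minimum-risk choice shifts to 'c' although 'e' remains the more probable character.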

In  general,   the  synthetic  basis  for  character  recognition  decision  provides  principles  that  are  in 
marked contrast to those used for either all-or-nothing or best-fit techniques. This contrast reflects
important distinctions that have been made between pattern (and character) detection and pattern (and
character) recognition. 4/ Metzelaar distinguishes between systems that sort different input patterns
into preassigned classes, where the possible patterns that may occur are known in advance, and
systems where it is necessary to determine features found in common for various subsets of a set
of observed patterns. 5/ Recognition techniques that employ synthetic decision criteria belong to the
latter  category.     The  contrast  can  be  equated,    in  large  part,   with  the  typical  differences  in  objectives 
and  approach  as  between  designers  of  equipment  and  researchers  concerned  with  general  problems  of 
pattern  recognition.     However,   the  contrast  is  also  related  to  prospects  for  further  progress, 
specifically  with  respect  to  design  characteristics  that  would  allow  greater  flexibility  and  open-ended 
vocabularies. 

6.4  General Characterization and Relative Advantages and Disadvantages of Selected Systems

The  various  categories  of  scanning  and  input  transformation  processing,   pattern  comparison 
processing,   and  bases  for  recognition-identification  decision,   are  combined  in  different  ways  in 
various  specific  systems.     Table  IV  gives  examples  for  some  of  the  possible  combinations  as  found 
in  certain  operating  or  proposed  systems. 

Other  methods  of  characterizing  the  overall  system  design  of  various  character  recognition 
proposals  include  those  based  on  the  type  and  extent  of  information-destructive  transformations  that 
are  performed.     These  would  range  from  little  or  no  reduction  of  the  original  information  of  the 
source  pattern  to  the  highly  reductive  criterial  feature  extraction  and  analysis  systems.     Kamentsky, 
for  example,    classifies  systems  on  this  basis  as  belonging  to  the  following  categories:     (1)  'element 
matching',    comparable  to  coordinate  grid  template  systems;  (2)  'feature  matching';  (3)  'searching  for 
features',    which  includes  both  curve-tracing  and  vector  crossing  techniques;  (4)  'feature  extraction  by 
spatial  transformations',   which  would  include  both  the  Bomba-type  local  operations  techniques  and 
'blobbing',   and  (5)  'spatial  computers',    involving  the  use  of  devices  to  provide  feature  extractions  and 
other local operations by parallel processing. 6/


1/ Bledsoe, W. W., and I. Browning. "Pattern recognition and reading by machine", Ref. 51; see
   also pp. 97-98 of this report.

2/ Blokh, E. L. "The question of the minimum description", Ref. 53.

3/ Chow, C. K. "An optimum character recognition system using decision functions", Ref. 74,
   pp. 122-123.

4/ Unger, for example, applies the term 'recognition' to operations in which the input pattern is to
   be identified as equivalent to some one of several known reference patterns. Pattern 'detection',
   on the other hand, is: "the process of examining a set of figures and selecting those that fall
   into some particular class of patterns . . ." (Unger, S. H. "Pattern detection and recognition",
   Ref. 501, p. 1737).

5/ Metzelaar, P. "Mechanical realization of pattern recognition", Ref. 305, p. 4.

6/ Kamentsky, L. A. "Pattern and character recognition systems . . .", Ref. 244, pp. 305, 306.




In a survey conducted by Budd Electronics, 1/ primarily concerned with the more general
problem of pattern recognition as in the requirements for target identification from aerial photographic
data, distinctions are made between non-abstractive and abstractive techniques, and between
non-adaptive  and  adaptive  systems.     The  non-adaptive  techniques,   which  are  employed  in  most  of  the 
reader  systems  that  have  been  actually  demonstrated  to  date,   follow  fixed  comparison  processing 
steps  and  use  predetermined  recognition  criteria  as  bases  for  identification  decision.     The  non- 
abstractive techniques, in addition, provide recognition criteria that result from relatively straightforward
processes of direct, all-or-nothing matching or of best-fit correlations. In the abstractive
techniques,   which  we  have  previously  considered  under  the  term  'criterial',    selected  properties  that 
can  be  measured  are  extracted  both  from  reference  and  input  patterns,   and  the  measures  of  these 
properties  provide  the  basis  for  matching.     Finally,   the  adaptive  systems  are  those  in  which  no 
specific  recognition  criteria  are  preset.     Instead,   the  system  in  effect  'learns'  from  experience  with 
many  samples  to  discriminate  common  patterns.     Thus  it  is  considered  that  adaptive  recognition 
systems  properly  belong  to  the  general  category  of  self-organizing  systems.     Such  distinctions  re- 
emphasize  the  differences  we  have  observed  between  development  of  automatic  reading  equipment 
and  research  investigations  of  pattern  recognition  and  pattern  detection. 

Various schemes for the comparative characterization of different systems often reflect presuppositions
as to the relative advantages or disadvantages of particular techniques. In the absence
of  operational  usage  data  for  most  systems,  however,    such  supposed  advantages  or  disadvantages 
are  clearly  related  to  different  objectives  and  different  possible  applications.     Uhr,   for  example, 
identifies optical-template and atomistic-matching (coordinate description template) approaches as
"simple-minded and fallible" and considers the analytical and criterial techniques to be both more
reasonable and more promising. 2/ Similar conclusions are reached by others who, like Uhr, are
primarily  concerned  with  problems  of  pattern  recognition.  3/    Where  the  problems  of  character 
recognition  are  extended  to  the  case  of  handwritten  and  handprinted  source  patterns,   the  assumed 
disadvantages  of  the  template  methods  are  even  more  strongly  indicted.     For  example,   Sherman 
describes  the  typical  difficulties  as  follows: 

"The  use  of  templates  fails  for  hand-printed  character  recognition  where  writers 
are  permitted  freedom  of  size,   location,    slant,    rotation  and  other  characteristics  which 
personalize  hand-printing.     One  extension  of  the  template  technique  to  overcome  these 
difficulties  would  be  the  use  of  two-dimensional  correlation  or  Fourier  analysis.     Here 
the  kernel  of  the  correlation  integral  should  include  a  number  of  independent  parameters 
whose  value  is  varied  until  maximum  correlation  with  the  distorted  master  is  achieved. 
To  account  only  for  linear  distortions  would  require  that  each  character  be  matched 
against  the  entire  file  of  masters,   in  which  each  master  has  been  given  an  allowable  range 
of magnification in each dimension, of translation in x and y directions, of horizontal and
vertical  shear,   and  of  movement  of  the  center  and  angle  of  rotation.     Unfortunately  the 
number  of  allowable  combinations  of  these  parameters  and  the  range  of  values  discourages  one 
from  the  use  of  correlation  techniques.     Since  the  individual  distortions  of  hand-printing 
include  non-linear  as  well  as  linear  parameters,   the  necessity  for  some  other  recognition 
techniques becomes apparent." 4/

On the other hand, for many practical character reading applications, template matching
systems provide advantages of speed, simplicity, economy, and ease of maintenance. Optical templates
such as photographic negative reference patterns tend to mask out both background noise and
fragments of other characters which may overlap the source pattern image field, provided there is
uniform spacing per character in any line and that the registration of the first character in the line can
be determined with sufficient accuracy to permit control of the scanning of the rest of the line.

1/ Rosenfeld, A. Private communication, February 1961.

2/ Uhr, L. "Intelligence in computers: the psychology of perception in people and in machines,"
   Ref. 494, pp. 179, 180.

3/ Compare, for example, the following: "... Programs for pattern recognition are relatively
   successful when original preparations of the patterns are strictly controlled. These programs
   are, however, relatively incompetent to identify patterns within simple transformations. To
   solve this problem of form distortion, a more general logic for information extraction and
   identification is needed. Preferably, this general-logic should not be one that is based upon a
   specific simplified problem of identifying points of contrast in a small, unambiguous array" --
   Research Directorate Quarterly Report No. 1, System Development Corporation, Ref. 469,
   p. 45.

4/ Sherman, H. "A quasi-topological method for machine recognition of line patterns", Ref. 421,
   p. 233.

Various  techniques  are  available  which  permit  the  simultaneous  matching  of  the  input  pattern  to 
a  number  of  reference  patterns  in  parallel,    without  moving  parts  or  complicated  logic  circuitry. 
Various  mechanisms  may  be  employed  to  allow  shifting  from  one  set  of  vocabulary  reference  patterns 
to  another  if  the  total  vocabulary  contains  more  characters  than  can  be  matched  in  a  single  parallel 
comparison  step.     Such  parallel  matching,    coupled  with  best-fit  decision-making,    solves  the  problems 
of  ambiguity  where  the  template  for  one  character  is  included  in  (covered  by)  the  template  for  another. 
The template systems, however, have inherent limitations as to the allowable size and the requirements
for prior specification of the vocabulary, and they are particularly susceptible to nonrecognition
or  to  misrecognition  where  the  source  patterns  are  frequently  mutilated  and  degraded. 
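
The ambiguity that arises when one template is included in another, and its resolution by best-fit decision, can be shown with a toy example. The 5 x 3 templates for 'E' and 'F' below are hypothetical, and both matching rules are simplified stand-ins for optical masking, not descriptions of any particular reader.

```python
# Hypothetical 5 x 3 templates; '#' is black, '.' is white.
TEMPLATES = {
    "E": ["###", "#..", "##.", "#..", "###"],
    "F": ["###", "#..", "##.", "#..", "#.."],
}

def cells(image):
    """Flatten an image into a list of cells, row by row."""
    return [ch for row in image for ch in row]

def covering_matches(image):
    """All-or-nothing: every black cell of the template is black in the input."""
    return [name for name, tpl in TEMPLATES.items()
            if all(i == "#" for i, t in zip(cells(image), cells(tpl)) if t == "#")]

def best_fit(image):
    """Best fit: the template agreeing with the input on the most cells wins."""
    def agreement(name):
        return sum(i == t for i, t in zip(cells(image), cells(TEMPLATES[name])))
    return max(TEMPLATES, key=agreement)

input_e = TEMPLATES["E"]
print(covering_matches(input_e))  # the 'F' template is covered by an 'E' input
print(best_fit(input_e))
```

An 'E' input satisfies both templates under the all-or-nothing rule, since every stroke of 'F' is also a stroke of 'E'. Comparing all templates in parallel and taking the closest match removes the ambiguity.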

In  general,    then,    we  find  that  devices  intended  for  limited  special-purpose  applications,    for 
closed vocabularies of 20 to 200 characters, or for situations where consistently high quality input
can  be  expected,    require  a  less  sophisticated  logic  and  often  offer  higher  speeds  of  recognition  than 
do  more  general-purpose  systems.     The  more  complicated  logics,   however,   that  are  suitable  for 
larger  vocabularies,   that  provide  reasonable  flexibility  in  accommodating  a  variety  of  type  styles, 
and  that  can  handle  relatively  poor  quality  source  patterns,    require  additional  hardware  at  increased 
expense  and  usually  with  considerably  increased  total  size  and  complexity  of  equipment.     Thus 
questions  of  relative  advantages  and  disadvantages  of  various  techniques  and  different  systems  are 
clearly  related  to  specific  operational  requirements  for  specific  proposed  applications  or  particular 
objectives,    whether  these  are  those  of  research  investigations  or  those  of  time-saving  economy. 

7.     FURTHER  PROSPECTS  FOR  CHARACTER  RECOGNITION  DEVELOPMENT 

Research  and  development  efforts  are  being  actively  pursued  by  a  number  of  organizations  and 
individuals  interested  in  the  practical  realization  of  automatic  character  reading  techniques.     These 
efforts  range  from  those  which  have  to  do  with  basic  questions  of  pattern  recognition  and  pattern 
perception  to  those  which  are  concerned  with  detailed  techniques  for  reducing  the  redundancy  of 
information  or  process  steps  and  with  the  determination  of  minimal  paths  through  particular  decision- 
tree  logics. 

We shall consider in this section those efforts which appear to be directly related to the development
or improvement of practical character recognition systems. For example, statistical sampling
studies of variations in source pattern presentations of the same character, whether based on
manual 1/ or machine observation and test, are in this category if the objective is the design or
modification  of  a  specific  recognition  logic  or  the  design  of  an  improved  font. 

In  particular,    promising  areas  for  further  development  of  character  recognition  systems  for 
large  vocabularies  (such  as  foreign  language  and  script  materials)  include  machine  simulation  of 
proposed  recognition  processes,    the  development  of  self-adjusting  and  self-setting  systems,    and  the 
use  of  context  for  improved  recognition-identification  of  both  characters  and  words.     Each  of  these 
areas  is  discussed  briefly  below. 

7.1  Machine Simulation


Digital  computers  have  been  used  to  simulate  the  performance  of  a  proposed  reading  system,   to 
test  the  proposed  recognition  logic  with  sample  characters,    and  to  adjust  recognition  sequences 
(identification  formulas)  for  maximum  discrimination.     In  addition,   the  coupling  of  a  suitable  scanning 
system  to  a  digital  computer  can  provide  the  basis  for  systematic  studies  of  the  performance  of 
specific  recognition  logics  for  varying  degrees  of  skew  of  source  patterns  or  for  varying  degrees  of 
character  deterioration  in  the  source  material.     Such  a  system  also  provides  means  for  study  of  the 
incidence  of  random  noise  about  symbols  of  varying  design  so  as  to  optimize  the  shapes  of  characters 
to  be  used  in  a  standardized  font. 

Greanias  and  others  at  IBM  have  made  extensive  use  of  computer  simulation  for  design  of 
optical  character  recognition  logics,    for  development  of  improvements  in  magnetic  character  reading 
systems,    and  for  determination  of  typical  noise  and  degradation  characteristics  for  representative 
input material. 2/ In the development of the proportional parts method for recognition, the use of
computer simulation allowed testing of details of the proposed logics for adequacy of character symbol
differentiation. These tests were conducted for a wide range of quality of source patterns and for
different vocabulary sets. The use of the 650 computer in simulation also provided a means for
determining the specific modifications to be made to plugboard-wired logic for a new or modified
vocabulary set. Continuing use of the computer for this purpose was anticipated in order to provide
accommodation of new input materials.

1/ In the special case of decision-tree logic disclosed by Fitch, for example, subsequent manual
   observations can be used to modify, by pluggable jumper wire connections, particular paths
   through the tree-network, which provides both multiple paths for the same character and
   path-sharing of common branch points. See Ref. 135.

2/ Greanias, E. C., C. J. Hoppel, et al. "Design of logic for recognition of printed characters
   by simulation", Ref. 175; Greanias, E. C., and Y. M. Hill. "Considerations in the design of
   character recognition devices", Ref. 172.

Other  studies  of  typical  noise  characteristics  that  affect  the  required  scanning  resolution, 
recognition  and  identification  thresholds,   and  limits  of  character  discriminability,   which  were 
simulated by computer experiments, include those of Glover, 1/ Evey, 2/ and Weeks. 3/ Glover used
the TX-0 Computer at M.I.T. to test a vocabulary of 13 characters for both translation and noise.
The  original  12x12  coordinate  description  templates  for  these  characters  were  subjected  to  machine- 
controlled  distortions  to  create  given  percentages  of  superimposed  random  noise.     When  these 
machine-manufactured 'unknowns' were placed in an 18x18 source pattern image space, a 12x12 input
pattern  extraction  'window',    shifting  through  this  space,   enabled  a  series  of  comparisons  to  be  made 
for  all  the  translational  possibilities  and  for  specified  noise  distributions.     Data  from  the  analyses  of 
these  tests  led  to  criteria  for  thresholds  which,   if  exceeded,   indicated  that  good  registration  had  been 
achieved. 
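
An experiment of the kind Glover describes can be imitated in a few lines. The sketch below keeps the 12x12 template, the 18x18 image space, and the shifting 12x12 window from the description above; the 'T'-shaped template, the noise model (a fixed fraction of cells inverted), and the threshold value are assumptions made for illustration and are not Glover's data.

```python
import random

SIZE, SPACE = 12, 18  # template size and image-space size, as in the text

def make_template():
    """A hypothetical 12x12 'T': two-row top bar and two-column stem."""
    return [[1 if (r < 2 or c in (5, 6)) else 0 for c in range(SIZE)]
            for r in range(SIZE)]

def add_noise(template, fraction, rng):
    """Invert a given fraction of the cells, chosen at random."""
    noisy = [row[:] for row in template]
    all_cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    for r, c in rng.sample(all_cells, round(fraction * len(all_cells))):
        noisy[r][c] ^= 1
    return noisy

def place(pattern, top, left):
    """Put the 'unknown' into an otherwise white 18x18 image space."""
    space = [[0] * SPACE for _ in range(SPACE)]
    for r in range(SIZE):
        for c in range(SIZE):
            space[top + r][left + c] = pattern[r][c]
    return space

def window_scores(space, template):
    """Count agreeing cells for every position of the 12x12 window."""
    return {(top, left): sum(space[top + r][left + c] == template[r][c]
                             for r in range(SIZE) for c in range(SIZE))
            for top in range(SPACE - SIZE + 1)
            for left in range(SPACE - SIZE + 1)}

def register(scores, threshold):
    """Report the best window position if its score exceeds the threshold."""
    offset = max(scores, key=scores.get)
    return offset if scores[offset] > threshold else None

template = make_template()
unknown = add_noise(template, 0.05, random.Random(0))
scores = window_scores(place(unknown, 3, 4), template)
print(len(scores), register(scores, 130))
```

The window visits 7 x 7 = 49 translational positions. At the true position the score is exactly 144 less the number of inverted cells, so repeating the run over many noise levels and positions would indicate how high a registration threshold can be set.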

Conversely,   where  obtainable  samples  of  vocabulary  characters  deviate  from  'ideal'  character 
patterns  because  of  ineradicable  noise,   machine  simulation  methods  may  be  used  to  generate  the  ideal 
characters  desired.     Flores  and  Ragonese,   for  example,    describe  a  computer  technique  for  deriving 
the  waveform  that  should  be  generated  by  a  noise-free  magnetized  ink  character.    4/   These  studies 
have  as  objectives  both  the  facilitation  of  system  design  and  the  determination  of  the  effects  of  such 
variables  in  the  source  patterns  as  chemical  composition  of  ink,   fiber  structure  and  absorbency  of 
the  paper  carrier,   and  the  like.     It  is  claimed  that  the  computer  synthesizing  procedure  enables  the 
designer  to  investigate  predictable  changes  in  waveforms  generated  by  characters  that  have  been 
altered  in  various  specific  ways. 

Developmental  research  efforts  directed  to  improvements  in  magnetic  ink  character  recognition 
systems  at  General  Electric,    Remington  Rand,   IBM,    and  elsewhere,   have  used  machine  simulation 
techniques  for  the  design  of  improved  fonts,   for  improvements  in  means  to  detect  particular  signals 
in  the  presence  of  noise,   and  to  test  proposed  techniques.     Dickinson  in  particular  emphasizes  the 
contribution  of  the  use  of  the  computer  as  an  aid  in  the  design  of  the  type  font,    given  that  particular 
components and techniques are to be used. 5/ Jakowatz, Shuey, and White suggest not only machine
tests of many samples but also an experience-adapting storage of waveforms to be used as reference
patterns.     They  describe  their  techniques,    in  part,   as  follows: 

"The  input  consisted  of  randomly  occurring  fixed  waveforms  buried  in  additive  Gaussian 
noise  of  known  bandwidth.     No  prior  knowledge  of  the  time  occurrence  or  the  shape  of  the 
waveform  was  known  to  the  system.      After  a  sufficient  length  of  time,    the  system  developed 
in  its  memory  an  approximation  to  each  of  the  fixed  waveforms.     Furthermore,   it  gave,    with 
high  probability,   an  indication  each  time  one  of  the  fixed  waveforms  occurred  .... 

"If  several  adaptive  filters  are  properly  interconnected,   the  resulting  system  is  capable 
of  separating  several  non-overlapping  signals  buried  in  noise.     In  such  an  interconnection 
there  is  associated  with  each  adaptive  filter  a  memory  for  storing  one  waveform.     The  various 
waveforms  are  separated  into  their  respective  memories  by  means  of  inhibition  circuitry 
associated  with  each  threshold  .  .  .    The  memory  with  the  highest  correlation  with  the  incoming 
waveform is the only memory that is altered by that waveform.

"To date we have successfully demonstrated an adaptive filter capable of selecting
invariant waveforms from a background of Gaussian noise. The adaptive filter may be
viewed as a matched filter whose characteristics can be altered by altering the memory
associated with the device. With experience, the contents of the device's memory
approaches the signal being sought. . ." 6/

1/ Glover, E. B. "Simulation of a character recognition system utilizing a general purpose
digital computer", Ref. 167.

2/ Evey, R. J. "The use of a computer to design character recognition logic", Ref. 126.

3/ Weeks, R. W. "Rotating raster character recognition system", Ref. 528.

4/ Flores, I. and F. Ragonese. "A method for synthesizing the waveform generated by a
character, printed in magnetic ink, in passing beneath a magnetic reading head", Ref. 137.

5/ Dickinson, W. E. "A character-recognition study." Ref. 95.

6/ Jakowatz, C. V., R. L. Shuey, G. M. White. "Adaptive waveform recognition," Ref. 239,
pp. 1, 11.
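
The interconnection of adaptive filters described in the quotation can be illustrated with a minimal sketch. It assumes that each memory is a short list of samples, that normalized correlation stands in for the matched-filter output, and that "inhibition" simply means that only the best-correlated memory is updated. The function names, the threshold, and the update rate are hypothetical and are not taken from Jakowatz, Shuey, and White.

```python
import math

def correlation(a, b):
    """Normalized correlation between two equal-length sample lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def present(memories, window, threshold=0.5, rate=0.1):
    """Show one input window to a bank of waveform memories.

    Returns the index of the memory that responded, or None if no memory
    reached the threshold. Only the best-correlated memory is altered,
    which is a simple stand-in for the inhibition circuitry (assumption).
    """
    scores = [correlation(m, window) for m in memories]
    best = max(range(len(memories)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None
    # Move the stored waveform a small step toward the incoming waveform.
    memories[best] = [m + rate * (w - m) for m, w in zip(memories[best], window)]
    return best
```

With repeated presentations, each memory drifts toward the waveform that most often excites it, which is the sense in which the memory contents approach the signal being sought.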

Howard 1/ has described the use of computer simulation to generate and test various recognition
logics with respect to different coordinate descriptions of the 'same' character and to derive optimum
decision rules, most useful test criteria, and the like. For an 8x10 array, approximately 200
different patterns are generated for each character. To find test criteria that can be satisfied for all
versions of a given character, all combinations of 2-tuples are considered for 4 inclusive-or and 2
exclusive-or functions of the 2 variables. Finally, an iterative process is used to select the specific
tests that are most useful in discriminating the set of representations associated with one
character from the sets for other characters.
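
A rough sketch of this kind of test generation follows. It assumes that each pattern is a flat sequence of 0/1 cell values, that the "2-tuples" are pairs of cell positions, and that the six functions are the four inclusive-or forms obtained by negating either input plus exclusive-or and its complement. That reading of the six functions, and all names used, are assumptions rather than details of Howard's program.

```python
from itertools import combinations

# Six functions of two binary variables (assumed interpretation).
FUNCTIONS = {
    "a or b": lambda a, b: a or b,
    "a or not b": lambda a, b: a or not b,
    "not a or b": lambda a, b: (not a) or b,
    "not a or not b": lambda a, b: (not a) or (not b),
    "a xor b": lambda a, b: a != b,
    "a xnor b": lambda a, b: a == b,
}

def invariant_tests(variants):
    """Return (i, j, name) tests satisfied by every variant of one character."""
    n = len(variants[0])
    tests = []
    for i, j in combinations(range(n), 2):
        for name, fn in FUNCTIONS.items():
            if all(fn(bool(v[i]), bool(v[j])) for v in variants):
                tests.append((i, j, name))
    return tests

def discriminating_tests(own, others):
    """Keep the invariant tests that fail for at least one other-character pattern."""
    keep = []
    for i, j, name in invariant_tests(own):
        fn = FUNCTIONS[name]
        if any(not fn(bool(v[i]), bool(v[j])) for v in others):
            keep.append((i, j, name))
    return keep
```

The iterative selection step mentioned above would then rank the surviving tests by how many competing patterns each one excludes; that step is not shown here.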

Statistical  studies  of  the  incidence  of  noise,   using  machine  processing  so  that  large  samples 
may  be  handled  and  so  that  effects  on  particular  logics  may  be  evaluated,   are  particularly  important 
for  the  development  of  automatic  reading  techniques  which  would  be  capable  of  coping  with  varied 
natural  language  materials.     For  potential  mechanized  translation  application,   for  example, 
diacritical marks used in the alphabets of various languages, such as grave, circumflex, and acute
accents, tilde, dieresis, and vowel point, may be essential to discrimination between different
inflectional  forms  of  words  or  stems  in  the  source  language  dictionary.     Similarly,    in  machine 
processing  of  natural  language  texts  in  linguistic  analyses,   automatic  abstracting,   and  the  like,   it 
may  be  essential  for  character  reader  input  devices  to  identify  the  punctuation  marks  as  well  as  the 
alphanumeric  character  symbols. 

Finally,   we  note  the  use  of  computer  programs  and  machine  simulation  for  investigation  and 
design  of  the  type  of  recognition  logic  which  is  based  upon  the  identification  of  unique  black  and 
unique  white  areas  for  each  symbol  in  the  vocabulary.     For  example,    design  for  the  matrix 
weightings  in  a  proposed  Philco  reader  was  developed  on  the  basis  of  simulations  on  the  Philco  2000 
computer.     Routines  developed  for  this  purpose  typically  compare  a  large  number  of  different 
character  symbols  (in  the  same  or  different  fonts  as  desired)  and  determine  the  parts  of  the  viewing 
raster that are black for all symbols tested, white for all symbols, or sometimes black and
sometimes white. This last set of sub-areas is the only part of the source pattern image field that can be
used to discriminate between the various characters. This type of logic is obviously sensitive to
rectilinear translations of the source patterns in the image field. To overcome this difficulty, other
operations can be used that calculate the center of gravity of any input pattern image and then shift
the derived input pattern so as to move the center of gravity to any desired position with respect to
the reference patterns with which it will be compared.
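
The two operations just described, sorting raster cells into always-black, always-white, and mixed areas, and re-centering a pattern on its center of gravity, can be sketched as follows. The sketch assumes small binary rasters held as lists of rows and patterns with at least one black cell. It is an illustration only and does not reproduce the Philco routines.

```python
def classify_cells(patterns):
    """Label each raster cell 'black', 'white', or 'mixed' over all patterns."""
    rows, cols = len(patterns[0]), len(patterns[0][0])
    labels = []
    for r in range(rows):
        row = []
        for c in range(cols):
            values = {p[r][c] for p in patterns}
            if len(values) > 1:
                row.append("mixed")  # the only cells useful for discrimination
            else:
                row.append("black" if 1 in values else "white")
        labels.append(row)
    return labels

def center_of_gravity(pattern):
    """Mean (row, column) of the black cells; assumes at least one black cell."""
    cells = [(r, c) for r, row in enumerate(pattern) for c, v in enumerate(row) if v]
    n = len(cells)
    return (sum(r for r, _ in cells) / n, sum(c for _, c in cells) / n)

def shift_to(pattern, target):
    """Translate the pattern so its center of gravity lands near `target`."""
    rows, cols = len(pattern), len(pattern[0])
    cr, cc = center_of_gravity(pattern)
    dr, dc = round(target[0] - cr), round(target[1] - cc)
    out = [[0] * cols for _ in range(rows)]
    for r, row in enumerate(pattern):
        for c, v in enumerate(row):
            # Cells shifted outside the raster are dropped.
            if v and 0 <= r + dr < rows and 0 <= c + dc < cols:
                out[r + dr][c + dc] = 1
    return out
```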

Examples  of  computer  simulation  for  determining  such  significant  or  criterial  areas  are 
reported in papers by Wada 2/ and Evey, 3/ among others. Stearns 4/ considers the case where
determination  of  comparative  discrimination  is  made  on  the  basis  of  binary  encoding  of  coordinate 
positions  as  either  black  or  white.     He  is  then  concerned  with  the  derivation  of  a  Boolean  product  for 
families  of  possible  coordinate  descriptions  of  the  same  character  pattern.     This  establishes 
possibilities  for  combining  several  variants  of  a  given  pattern  into  a  reduced  pattern,    discarding 
those  elements  which  for   that  given  pattern  might  be  either  black  or  white,    and  retaining  only  those 
that  must  be  either  white  or  black.     Stearns  used  the  704  computer  not  only  to  derive  such  reduced 
'output  equations'  (master  identification  formulas)  but  also  to  test  the  system  for  the  identification 
of  typewritten  numeric  characters. 
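
The reduction of several variants to a single reduced pattern can be shown in a few lines. The sketch assumes that each variant is a flat sequence of 0/1 values and uses None for positions that may be either black or white. The function names are illustrative and the code is not Stearns' 704 program.

```python
def reduce_variants(variants):
    """Combine variants of one character into a reduced pattern.

    Each position becomes 1 (must be black), 0 (must be white),
    or None (may be either, and is therefore discarded).
    """
    return [column[0] if len(set(column)) == 1 else None for column in zip(*variants)]

def matches(reduced, pattern):
    """True if the pattern agrees with every retained position."""
    return all(r is None or r == p for r, p in zip(reduced, pattern))
```

The retained positions correspond to the literals of a Boolean product: an input satisfies the reduced pattern only if every must-be-black cell is black and every must-be-white cell is white.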

Computer  simulation  has  thus  been  used  for  the  design  of  fonts,   for  the  statistical  analysis  of 
probable  location  and  extent  of  noise  likely  to  occur  with  real-life  characters,   for  the  derivation  of 
criterial areas in sampling-scan or weighted area coordinate description systems, for the
identification  of  criterial  features,    for  the  'learning'  of  appropriate  recognition  formulas  on  the  basis 
of  a  number  of  samples  of  the  expected  character  population,   and  for  the  determination  of  minimum 
decision-making  paths  for  particular  recognition  logics. 

Machine  simulation  may  also  be  of  value  in  the  determination  of  other  design  parameters,    such 
as  the  optimization  of  clipping  levels  for  scanner  thresholds  for  a  given  range  of  relative  black  and 
relative  white  in  input  material,    and  the  like.  In  addition,    computer  simulation  has  been  extensively 
used in general pattern recognition research.

1/ Howard, P. H. "A computer method of generating recognition logics for printed characters."
Ref. 221.

2/ Wada, H. et al. "An electronic reading machine", Ref. 521.

3/ Evey, R. J. "Use of a computer to design character recognition logic," Ref. 126.

4/ Stearns, S. D. "A method for the design of pattern recognition logic", Ref. 451.

7.2 Self-Adjusting and Self-Setting Systems

Possibilities  for  improvements  in  automatic  character  reading  techniques,   in  terms  of  system 
self-adjustment,   include,   first,    a  variety  of  operations  designed  for  improvement  of  the  input  pattern 
prior  to  further  processing.     Such  improvement  may  consist  in  repositioning  as  a  result  of  feedback 
from  a  preliminary  scan,   by  movement  of  the  source  pattern  in  the  image  field,    or  by  movement  of 
the  scanning  mechanism,   or  both.     It  may  also  consist  of  finding  guides  or  reference  marks  to  control 
micropositioning.     Input  pattern  improvement  operations  that  are  involved  in  self-adjusting  systems 
may  also  consist  of  the  framing  or  masking  out  of  unwanted  portions  of  the  image  field,   once  the 
location of the source pattern in the field has been determined, or the servoing of a peephole-type
matching  array  so  that  its  center  coincides  with  the  center  of  the  detected  source  pattern. 

Other  possibilities  in  self-adjusting  input  pattern  improvement  operations  reduce  the  redundancy 
of  the  information  to  be  scanned  and  identified.     Such  operations  include  cartooning  or  skeletonizing  of 
the  source  pattern  in  a    reductive  transformation  to  the  desired  input  pattern.     Similarly,    character 
strokes  may  be  automatically  thickened  to  an  average  width  for  a  given  line  of  characters,   once  the 
character  spacing  for  that  line  has  been  determined.     Input  improvement  operations  intended  to 
improve the quality of the source pattern range from contrast enhancement to a variety of normalizing
processes such as edge-smoothing, integrating over stroke-body areas, and filling in of missing
portions  by  statistical  techniques. 

Secondly,    character  frequency  probabilities  may  be  explored  and  applied  for  use  in  self- 
adjusting  systems.     Such  information  can  be  used  for  error  detection,    reject  reinstatement,   and 
interpolations  for  obviously  garbled  material.     The  use  of  letter  frequency  probabilities  for  English, 
Russian  and  other  language  material  to  fill  in  missing  or  nonrecognized  characters  of  an  input 
message,   or  to  provide  the  basis  for  choice  between  alternative  readings  of  an  ambiguous  character 
symbol,   would  be  of  particular  value  in  automatic  dictionary  or  mechanized  translation  applications 
of  reader  devices. 
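
As a small illustration of choosing between alternative readings, the following sketch scores each candidate by how often it has been seen next to its neighbors in a sample of text. The use of adjacent-pair counts, the add-one adjustment, and the function names are assumptions made for the example; the score is a simple count product, not a calibrated probability.

```python
from collections import Counter

def bigram_counts(text):
    """Count adjacent character pairs in a sample text (spaces included)."""
    return Counter(zip(text, text[1:]))

def resolve(prev_char, candidates, next_char, counts):
    """Pick the candidate reading that best fits its two neighbors.

    One is added to each count so that unseen pairs are treated as
    unlikely rather than impossible.
    """
    def score(ch):
        return (counts[(prev_char, ch)] + 1) * (counts[(ch, next_char)] + 1)
    return max(candidates, key=score)
```

In practice the counts would be drawn from a large sample of the language and subject matter expected at the reader input.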

Thirdly,   adjustments  for  variable  size  can  be  developed  for  reader  devices  such  that  the 
system  is  capable  of  readjusting  its  thresholds  or  its  input  pattern  transformations  in  accordance 
with  the  detected  characteristics  of  a  leading  character  symbol.     A  preliminary  step  in  this  direction 
was  achieved  in  an  early  Farrington-IMR  reader  for  Post  Office  envelope  sorting  applications.     This 
was  to  assume  that  the  first  character  in  an  address  destination  word  is  an  upper  case  symbol  and  to 
adjust  thresholds  for  subsequent  detection  of  lower  case  letters  having  ascenders  in  accordance  with 
this  detected  size.     Recent  Philco  developments,   also  for  the  Post  Office,    propose  self-adjustments 
in  the  scan  mode,   with  several  possible  tries  at  height-width  ratios,    so  that  variable  size  address 
characters  can  be  normalized  for  quantization  in  a  fixed  22x12  format.     There  are  also  possibilities 
for  developing  a  peephole  template  logic  where  the  criterial  aperture  array  can  be  automatically 
expanded  or  contracted  in  accordance  with  the  detected  boundaries  of  a  preliminary  input  symbol. 
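
Normalization of a variable-size character to a fixed format can be sketched as a resampling of the character's bounding box. The sketch assumes that the 22x12 format means 22 rows by 12 columns, that the input is a binary raster with at least one black cell, and that nearest-neighbor sampling is acceptable. None of these details is taken from the Philco design.

```python
def bounding_box(pattern):
    """Return (top, bottom, left, right) of the black cells, end-exclusive."""
    rows = [r for r, row in enumerate(pattern) if any(row)]
    cols = [c for c in range(len(pattern[0])) if any(row[c] for row in pattern)]
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1

def normalize(pattern, out_rows=22, out_cols=12):
    """Resample the character's bounding box to a fixed raster."""
    top, bottom, left, right = bounding_box(pattern)
    height, width = bottom - top, right - left
    return [
        [pattern[top + (r * height) // out_rows][left + (c * width) // out_cols]
         for c in range(out_cols)]
        for r in range(out_rows)
    ]
```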

The idea of self-setting systems, in contradistinction to self-adjusting recognition devices,
presupposes  that  the  input  material  will  be  quite  varied  and  that  a  large,   general  purpose  machine 
would  be  used  to  determine  the  characteristics  of  particular  incoming  material.     Such  a  machine 
would  then  either  proceed  to  direct  recognition-identification  processing  of  that  material  or  would 
determine  proper  routing  to  other  special-purpose  readers.     It  is  assumed,   that  is,   that  the 
machine  would  be  capable  of  looking  at  a  particular  sample  of  input  material,    determining  what  type 
style,    contrast  levels,    etc.  ,   are  involved,   and  perhaps  adjusting  master  identification  formulas  or 
other  recognition  logic  requirements  in  accordance  with  the  observed  characteristics  of  the  sample. 

Such  an  operation  would  presumably  require  either  control  over  the  input  such  that  each  piece 
of  material  to  be  read  has,   preceding  the  message  proper,   a  sample  alphabet  in  that  font,   or  else 
pre-editing  to  select  and  designate  for  machine  inspection  such  a  sample  alphabet.     In  the  latter  case 
it  must  be  recognized  that  many  texts  may  not  include  a  complete  alphabet,   and  that  very  few  would 
include  complete  alphabets  in  both  upper  and  lower  case.     However,   in  the  practice  of  human 
recognition of different type-styles, considerably fewer characters than a complete alphabet are
typically used. Karch, 1/ for example, classified approximately 1,500 different type-styles on the
basis of the discriminating characteristics found in only seven sample characters, specifically,
lower case "g", "a", "t", "e", "d", and upper case "E" and "G".

1/ Karch, R. R. "How to recognize type faces," Ref. 245.

Unfortunately, even this significant reduction in the number of sample characters that would
need  to  be  inspected  in  order  to  determine  a  particular  style  so  that  the  machine  may  select  or  adjust 
an  appropriate  recognition  logic  is  not  too  helpful.     Random  samples  of  text  were  taken  from  several 
of  the  references  listed  in  the  Appendix  of  this  report  in  order  to  test  this  point.     In  several  of  these 
texts,   no  upper  case  "G"  was  found  at  all,    and  in  most  cases  upper  case  "E"  was  found  only  after 
scanning  a  large  number  of  sentences.     Table  V  lists  the  observed  frequencies    of  the  upper  case 
letters  found  in  this  particular  sample.     Table  VI  lists  the  frequencies  of  the  upper  case  characters 
found,    excluding  those  used  in  proper  names.     Table  VII  shows  the  observed  frequencies  of  the  most 
commonly  used  initial  words  of  sentences  in  the  sample.     These  results  suggest  that  much  more 
extensive  analysis  of  character  frequencies  for  representative  samples  of  expected  input  text  might 
contribute  to  the  development  of  improved  self-setting  reader  systems. 
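
The kind of tally reported in Tables V through VII is easily mechanized. The following sketch checks a text sample for the seven Karch sample characters and counts the upper case letters present. The function names are illustrative, and the sketch does not attempt the exclusion of proper names used for Table VI.

```python
from collections import Counter

# The seven sample characters: lower case g, a, t, e, d and upper case E, G.
KARCH_SAMPLE = set("gatedEG")

def missing_sample_characters(text):
    """Return the sample characters that never occur in the text."""
    return sorted(KARCH_SAMPLE - set(text))

def upper_case_frequencies(text):
    """Count upper case letters, most frequent first."""
    return Counter(ch for ch in text if ch.isupper()).most_common()
```

A self-setting reader could run such a check on a preliminary scan to decide whether enough of the sample characters are present before attempting to identify the type style.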

7.3 Use of Context in the Automatic Recognition of Characters and Words

The  possibilities  for  both  self-adjusting  and  self-setting  reader  systems  may  be  enhanced  by  the 
extent  to  which  it  is  practical  to  make  use  of  observed  context,   both  within  character  symbols  and 
between  characters.     For  example,   where  a  scanning  procedure  samples  the  image  field  at  selected 
points,   the  pickup  heads  at  these  points  may  be  weighted  in  accordance  with  their  relative  reliability 
in  identifying  the  symbols  of  the  vocabulary.     Such  weightings  may  be  specially  designed  in  order  to 
improve  reliability  of  sensing  and  reading  of  poor  quality  material.     That  is,   assuming  random 
incidence  of  superimposed  noise,    bleeding,   missing  portions  and  broken  strokes,   the  probability  of 
such factors affecting recognition can be reduced by looking only at a small number of points in the
source  pattern,   or  by  giving  variable  weights  to  input  pattern  elements. 
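
A weighted comparison at a few selected points might be sketched as follows. The sketch assumes binary rasters, a dictionary mapping each sampled (row, column) point to a reliability weight, and one reference pattern per symbol. The weights themselves would have to come from measurement; the values used in any example are hypothetical.

```python
def weighted_score(pattern, reference, weights):
    """Sum the weights of the sampled points where pattern and reference agree."""
    return sum(w for (r, c), w in weights.items() if pattern[r][c] == reference[r][c])

def recognize(pattern, references, weights):
    """Return the symbol whose reference pattern collects the most weight."""
    return max(references, key=lambda sym: weighted_score(pattern, references[sym], weights))
```

Because only the listed points are examined, noise that falls elsewhere in the image field has no effect, and noise at a low-weight point has little effect.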

In an adaptation of the peephole-template logic that was simulated on SEAC, it was found that
12 apertures, suitably placed, can discriminate between the 32 lower case characters in a particular
Cyrillic script, but that to identify any one character not all of these 12 decision points need to be
checked. That is, if aperture 1 in the upper-leftmost corner of the image area is black, and if
aperture 3 which is on a line with aperture 1 but in the center of the image area is also black, and if
aperture 6 in the geometric center of the image area is also black, "T" may be identifiable without
any further steps. The aperture template array, which serves as a master reference pattern, may
require some positions to be used if and only if other elements in the input pattern meet specified
criteria, as for example the aperture necessary to distinguish between 'shah' and 'shchah' (Fig. 13).
This procedure obviously involves taking advantage of the context of a particular input pattern
element in order to determine what other elements to look at, thus minimizing the number of steps
either in the pattern-element matching or in the checking out of identification formulas.
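
The sequential checking of apertures can be expressed as a walk through a decision tree. In the sketch below only the path through apertures 1, 3, and 6 to "T" follows the text; the remaining branches are collapsed into a hypothetical "other" leaf, and the data layout is an assumption.

```python
# Each node is (aperture_number, subtree_if_black, subtree_if_white);
# a string is an identified character. The "other" leaves are placeholders.
TREE = (1, (3, (6, "T", "other"), "other"), "other")

def identify(apertures, tree):
    """Walk the tree, reading only the apertures that the path requires.

    `apertures` maps aperture number to True (black) or False (white).
    Returns the leaf reached and the number of apertures actually checked.
    """
    steps = 0
    while not isinstance(tree, str):
        number, if_black, if_white = tree
        tree = if_black if apertures[number] else if_white
        steps += 1
    return tree, steps
```

A full tree for the 32-character vocabulary would replace each "other" leaf with further aperture tests, so that the number of steps depends on the character presented.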

In  this  special  sense,    context  dependency  may  also  be  considered  to  be  a  factor  in  optimization 
of  other  types  of  recognition  logics,    especially  those  employing  decision-tree  recognition  processing. 
That is, the choice of next-decision-step is dependent upon the results of preceding decision steps.
Examples are Glovazky's "code mobile" 1/ and Gill's minimum scan pattern recognition principles. 2/
Chow is concerned with a much broader context, namely, the expectancy of a given character
occurring in a given population in order to arrive at minimum risk criteria. 3/
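
The effect of character expectancy on a minimum-risk decision can be illustrated with a generic Bayes sketch. It assumes a unit loss for a wrong identification and a fixed, smaller cost for rejecting; these loss values and the function names are assumptions, and the code is not a reconstruction of Chow's decision functions.

```python
def posteriors(likelihoods, priors):
    """Combine P(observation | character) with the expected frequency P(character)."""
    joint = {ch: likelihoods[ch] * priors[ch] for ch in priors}
    total = sum(joint.values())
    return {ch: p / total for ch, p in joint.items()}

def decide(likelihoods, priors, reject_cost=0.2):
    """Return the minimum-risk character, or None to reject.

    Committing to the most probable character risks an error with
    probability 1 - posterior; rejecting always costs `reject_cost`.
    """
    post = posteriors(likelihoods, priors)
    best = max(post, key=post.get)
    error_risk = 1.0 - post[best]
    return best if error_risk <= reject_cost else None
```

With the same measurements, a character that is expected more often in the population may be accepted where a less expected one would be rejected.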

The use of context to resolve potentially ambiguous decisions has been mentioned in the case of
Bledsoe and Browning, and a similar type of 'dictionary-filtering' is advocated by Baran and Estrin. 4/
However, Uttley is probably the first to consider context-weighting in terms of machine methods for
pattern recognition, specifically in his 'conditional probability' models. Thus he is concerned with
simulating possible mechanisms for progressive adaptation to variable environments in which
conditional probabilities are built up for a signal (or feature) B, given A, and for these probabilities
in the context of a signal or signal sequence, C. 5/

Similarly, Cook has suggested 6/ that in systems where it is desirable to reject doubtful
recognitions, rejects might be re-read with a readjustment of thresholds or logic, or both, so as to
take cognizance of significant portions of the character image in terms of the information gained on
the first reading. A pattern recognition research project at the Laboratory for Electronics is
concerned with possibilities for an adaptive system in which preliminary matching is to be made
on a topological basis, but in which additional features will be sought, depending upon
within-character context for a class of topologically equivalent patterns.

1/ Glovazky, A. "Determination of redundancies in a set of patterns", Ref. 166.

2/ Gill, A. "Minimum-scan pattern recognition," Ref. 157.

3/ Chow, C. K. "An optimum character recognition system using decision functions," Ref. 74.
See also Refs. 73, 75, 201.

4/ Baran, P. and G. Estrin. "An adaptive character reader," Ref. 34.

5/ See Uttley, A. M. Refs. 510, 513, 514, 515.

6/ Cook, H. D. "A study of print reading systems," Ref. 80.

TABLE V. Frequencies of Upper Case Characters

Rank   Character   Incidence
  1        T          420
  2        A          187
  3        I          161
  4        S          111
  5        F           81
  6        C           76
  7        B           56
  8        M           56
  9        W           50
 10        H           45
 11        O           43
 12        R           41
 13        P           33
 14        N           32
 15        E           29
 16        D           24
 17        L           13
 18        U           12
 19        J            8
 20        V            8
 21        G            7
 22        K            7
 23        Y            7
 24        Q            6
 25        Z            1
 26        X            0

TABLE VI. Frequencies of Upper Case Characters, Excluding Proper Names

Rank   Character   Incidence   % Incid.   Acc. %
  1        T          408        36.2%
  2        A          149        13.2%     49.4%
  3        I          145        12.9%     62.3%
  4        F           69         6.1%     68.4%
  5        W           46         4.1%     72.5%
  6        O           41         3.6%     76.1%
  7        S           40         3.5%     79.6%
  8        C           34         3.0%     82.6%
  9        H           33         3.0%     85.6%
 10        M           30         2.7%     88.3%
 11        B           23         2.0%     90.3%
 12        E           17         1.5%     91.8%
 13        P           17         1.5%     93.3%
 14        D           14         1.2%     94.5%
 15        N           13         1.2%     95.7%
 16        R           13         1.2%     96.9%
 17        U           11         1.0%     97.9%
 18        V            8         0.7%     98.6%
 19        L            6         0.5%     99.1%
 20        Q            6         0.5%     99.6%
 21        G            3         0.3%     99.9%
 22        K            2         0.2%    100.1%
 23        J            0
 24        X            0
 25        Y            0
 26        Z            0

TABLE VII. Frequencies of Initial Words

Rank   Word             Incidence   % Incid.   Acc. %
  1    The                 244        21.6%
  2    In                   67         5.9%     27.5
  3    This                 66         5.9%     33.4
  4    A, an                50         4.4%     37.8
  5    It, its              40         3.5%     41.3
  6    Figure, Fig.         32         2.8%     44.1
  7    These                26         2.3%     46.4
  8    If                   23         2.0%     48.4
  9    However              21         1.9%     50.3
 10    Thus                 18         1.6%     51.9
 11    As                   16         1.4%     53.3
 12    At                   16         1.4%     54.7
 13    For                  16         1.4%     56.1
 14    Since                15         1.3%     57.4
 15    On                   11         1.0%     58.4
 16    To                   11         1.0%     59.4
 17    There                10         0.9%     60.3
 18    We                   10         0.9%     61.2
 19    Apparatus             9         0.8%     62.0
 20    But                   9         0.8%     62.8
 21    During                9         0.8%     63.6
 22    While                 9         0.8%     64.4
 23    All                   8         0.7%     65.1
 24    Magnetic              8         0.7%     65.8
 25    When                  7         0.6%     66.4
 26    Of                    6         0.5%     66.9
 27    Therefore             6         0.5%     67.4
 28    Another               5         0.4%     67.8
 29    Check, checks         5         0.4%     68.2
 30    Code, codes           5         0.4%     68.6
 31    Ink, inks             5         0.4%     69.0
 32    Obviously             5         0.4%     69.4
 33    One                   5         0.4%     69.8
 34    Under                 5         0.4%     70.2
 35    After                 4         0.4%     70.6
 36    Also                  4         0.4%     71.0
 37    And                   4         0.4%     71.4
 38    Automatic             4         0.4%     71.8
 39    Because               4         0.4%     72.2
 40    Even                  4         0.4%     72.6
 41    Furthermore           4         0.4%     73.0
 42    No                    4         0.4%     73.4
 43    Reading               4         0.4%     73.8
 44    Such                  4         0.4%     74.2
 45    Verification          4         0.4%     74.6
 46    With                  4         0.4%     75.0

These  examples  are  illustrative  of  possible  use  of  context  within  a  single  source  pattern, 
whether  it  be  an  individual  printed  character  or  an  address-word  for  envelope  sorting.     The  use  of 
context  for  a  series  of  input  patterns  (again,    either  single  characters  or  whole  words)  holds  promise 
for  situations  where  some  inaccuracy  can  be  tolerated,    such  as  preliminary  translations  of  foreign 
text  material  to  determine  the  general  subject  matter  content  of  an  input  document.     As  previously 
noted,    letter  frequencies  for  the  language  and  subject  matter  concerned  could  be  used,    subject  to 
check  of  context,   to  supply  identifications  of  ambiguous  or  missing  symbols. 

Check of inter-symbol context for an ambiguous character, such as that of Figure 23 which
might be either a broken "o" or a "c" which has 'bled', may be carried out by machine processing in
accordance with well-defined rules. Table VIII, for example, illustrates the rules that might be used
to identify the ambiguous symbol in Figure 23 if it occurred in combination with various other
characters. Thus, Reitwiesner and Weik suggest that checking "can be based on the inherent
redundancy of the written source language text, the 'impossibility' of certain letter combinations, and
a restricted list of permissible characters." 1/
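
One simple way to mechanize such checking is to try each possible reading of the ambiguous symbol against a list of permissible words. The sketch below assumes that the ambiguous position is marked "?" and that a small lexicon is available as a set. Both the marker and the lexicon are illustrative, and the approach is a stand-in for explicit rules of the Table VIII kind.

```python
def interpret(word, lexicon, candidates=("c", "o")):
    """Resolve one '?' in `word` by keeping the readings that form permissible words.

    Returns a list: one entry means the context decides the reading,
    several entries mean the symbol remains ambiguous, and an empty
    list means that no reading is permissible.
    """
    readings = [word.replace("?", ch, 1) for ch in candidates]
    return [reading for reading in readings if reading in lexicon]
```

A result with more than one entry corresponds to a "c or o" rule, and would have to be settled by wider context such as the neighboring words.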

8.     POTENTIALLY  RELATED  RESEARCH  IN  PATTERN  RECOGNITION 

In  the  preceding  section  we  have  considered  research  and  development  efforts  specifically 
related to prospects for further progress in the design and use of automatic character recognition
systems. When we look ahead to possibilities for significantly large symbol vocabularies,
potentially  related  research  on  problems  of  pattern  recognition  in  general  also  becomes  of  interest. 
For  example,   if  significantly  larger  vocabularies  are  to  include  characters  in  many  different  sizes 
and  styles  and  alphabets,    characters  that  are  handprinted  and  handwritten,   and  abstract  geometric 
shapes  that  are  involved  in  diagrams  and  other  graphic  material,   then  progress  in  research  on 
pattern  generalization  and  the  search  for  invariance  will  be  relevant  to  further  progress. 

Research  in  pattern  recognition  generally  is  usually  directed  to  quite  other  objectives  than  the 
development  of  practical  character  reading  equipment.     Among  the  purposes  to  which  it  may  be 
directed  are  those  of  developing  models  that  simulate  observed  neurophysiological  structure  and 
function.     To  an  extent,   the  demonstration  of  a  mechanism  that  works  as  the  brain  is  assumed  to  do 
may  also  add  information  to  the  knowledge  derived  from  neurophysiology  and  related  sciences.     Quite 
different  approaches  may  be  followed,   however,    ranging  from  the  neuron  network  simulation  work 
to  such  work  as  Uhr's  where  an  attempt  is  made  to  simulate  the  perception  phenomena  stressed  in 
the  Gestalt  school  of  psychology. 

The  results  of  work  in  these  research  areas  may  not  be  obviously  applicable  to  recognition 
system  design,   but  it  is  recognized  by  many  workers  in  the  field  that  these  results  may  provide  the 
necessary  impetus  and  insight  for  future  progress.     In  particular,    some  workers  feel  that  the 
directions  toward  which  the  more  general  pattern  recognition  research  efforts  point  are  precisely 
those that have been most neglected in character recognition developments to date. Thus Uhr states:

"There  is  a  critical  need  for  better  perceptual  mechanisms  in  machines.     The  sensory 
mechanism  problem  is  probably  close  to  solution,    with  a  rapid  advance  in  the  art  of  flying- 
spot  scanners  and  contour  tracers,    photo-electric  cell  arrays,   and  similar  optical- 
mechanical-electronic  gadgets.     But  the  true  perceptual  processes,    the  information- 
processing  and  recognizing  of  the  inputs  sensed  are  still  at  a  primitive  stage.  "  2/ 


1/ Reitwiesner, G. W. and M. H. Weik. "Survey of the field of mechanical translation of
languages." Ref. 380.

2/ Uhr, L. "Intelligence in computers: the psychology of perception in people and in
machines," Ref. 494, p. 178.

Figure 23. Ambiguous Character

TABLE VIII. Sample Rules for Interpreting the Ambiguous Character of Figure 23

X  ?               =  o
X  X  ?  Space     =  o
X  ?  X  Space     =  c or o
X  X  ?  B  Space  =  o
X  X  ?  D  Space  =  o
X  X  ?  E  Space  =  c, except in shoe, aloe, oboe
X  X  ?  F  Space  =  o
X  X  ?  G  Space  =  o
X  X  ?  H  Space  =  c
X  X  ?  K  Space  =  c or o
X  X  ?  L  Space  =  o
X  X  ?  M  Space  =  o
X  X  ?  N  Space  =  o
X  X  ?  O  Space  =  o
X  X  ?  P  Space  =  o
X  X  ?  R  Space  =  o
X  X  ?  T  Space  =  c or o
X  X  ?  W  Space  =  o
X  X  ?  Y  Space  =  c or o
X  A  ?  K  Space  =  c
X  E  ?  K  Space  =  c
X  I  ?  K  Space  =  c
X  O  ?  K  Space  =  c or o
X  U  ?  K  Space  =  c
B  X  ?  K  Space  =  c or o
B  O  ?  K  Space  =  o, or c (bock)
C  X  ?  K  Space  =  c or o
C  O  ?  K  Space  =  c or o
D  X  ?  K  Space  =  c
H  X  ?  K  Space  =  c or o
H  O  ?  K  Space  =  c or o
L  X  ?  K  Space  =  c or o
L  O  ?  K  Space  =  c or o
M  X  ?  K  Space  =  c
N  X  ?  K  Space  =  c or o
N  O  ?  K  Space  =  o
P  X  ?  K  Space  =  c
R  X  ?  K  Space  =  c, or o (rook)
J  X  ?  K  Space  =  c
T  X  ?  K  Space  =  c or o
T  O  ?  K  Space  =  o, or c (tock)

Similarly, Young, in his 1960 evaluation of the state of the character reading art, 1/ has noted
advances in scanning, sensing, and comparison processing, but points out the relative neglect, among
the system designers, of the "semantic" features of character patterns. It is precisely these
'semantic' features that are presumably involved in the development of criterial feature, parallel
testing and synthetic decision-making systems capable of handling open-ended vocabularies in a
general purpose manner.

System design decisions to employ criterial feature extraction and analysis for automatic
character recognition are usually based upon one or more of the following objectives:

(1)  To  reduce  the  redundancy  of  the  pattern  information  to  be  processed; 

(2)  To  provide  economy  in  reference  pattern  storage  requirements; 

(3) To provide flexibility with respect to possibilities for discriminating characters
regardless of variations in position-registration, in size, and to some extent
in specific details of character shape;

(4) To increase the likelihood of correct identification even though the exact source
pattern has not previously occurred, that is, to provide for some open-endedness
of vocabulary;

(5) To increase the flexibility of the system with respect to the number and variety
of fonts that can be accommodated without major re-design of logic, and without
requirement for precise advance identification of a particular font; and

(6)  To  simulate  processes  that  are  presumed  to  be  involved  in  human  perception  and 
pattern  recognition. 

The choice of specific criterial features to be used in a particular recognition system may be
based either on a priori assumptions as to the relatively invariant features of characters that are to
be accommodated or on various processes of empirical testing and measurement. Areas of basic
research in pattern recognition which are related to the design or improvement of criterial feature
extraction methods therefore include investigations of relative invariance of pattern features in both
human and machine perception, computer simulation and testing of proposed property-filtering
measurements, and experimentation with self-organizing systems.

Determinations  of  pattern  invariance  under  various  transformations,    computer  simulation,   and 
experiments  with  self-organizing  systems  are  also  used  in  recognition  systems  that  depend  on 
statistical  analyses  and  randomly  generated  operators.     In  particular,    computer  programs  designed 
to  simulate  pattern-learning  are  usually  directed  either  to  the  determination  of  sets  of  defining 
features  for  classes  of  patterns  or  toward  investigation  of  the  results  that  may  be  achieved  by  testing 
a retinal field statistically. 2/ The latter is exemplified in neuron network simulation projects which
use  statistical  techniques  for  the  building  up  and  reinforcing  of  explicit  response  behaviors  in  place 
of  initially  random  connections. 

Both the criterial or semantic approach and the random or statistical approach may be brought together,
however, in research programs in pattern recognition. Thus Uhr has made a series of investigations
starting with a bias toward abstractive, criterial analysis techniques, based at least in part on
a priori ideal properties of given characters and shape. He has more recently explored the
possibilities of an adaptive, criterial analysis technique, based on a posteriori results derived from
random concatenations of input pattern elements in a 'teaching', then 'recognizing', sequence. These
later Uhr-Vossler adaptive methods are still abstractive in a processing sense, as are those of
Novikoff, for example, but they are so with respect to syntactic rather than necessarily semantic
features. In other words, the criterial features that discriminate between the vocabulary characters
of systems such as those of Uhr-Vossler, Novikoff, Bledsoe-Browning, Alt (for computed moments),
and similar systems, are not necessarily recognizable as criterial by the human perceiver.

For both the random and the criterial approaches, then, prospects for further progress in
pattern  recognition  systems  appear  to  lie  in  at  least  three  areas  of  related  basic  research,    defined  by 
Gill  as  follows: 


1/ Young, D.A. "Automatic character recognition", Ref. 539, p. 6: "Despite the large quantity
of research work that is being undertaken, however, most of it is concentrated around
problems of sensing, filtering, and high-speed sorting of scan data, and too little attention is
being paid to the fundamental semantic features of character patterns."

2/ Compare Ref. 59, p. 5.

"(a) Deeper analysis of the redundancies inherent in the various classes of pattern
sources, 

(b)  Formulation  of  procedures  for  determining  optimal  sets  of  transformations 
required  for  recognizing  given  sets  of  patterns, 

(c) Simulation of learning processes with digital computers." 1/

We  shall  therefore  consider  briefly  certain  aspects  of  these  areas  of  potentially  related  research  in 
pattern  recognition,    including  the  areas  of  search  for  invariant  features,    search  for  criteria  of 
pattern  separability,   possibilities  for  machine  abstracting  and  automatic  classification,   and  machine 
models  of  perception,    recognition  and  pattern  generalization. 

8.1 The Search for Relative Invariance


In  the  character  recognition  systems  that  have  been  considered,   we  have  noted  certain  attempts 
to  provide  for  some  measure  of  identification-criteria  invariance,    despite  variability  of  source 
pattern  presentations  of  the  same  character.     For  example,    source-to-input  pattern  transformations 
in  some  systems  allow  some  minor  variability  with  respect  to  noise  and  minor  differences  of 
character style, such as serifs. Moreover, what we have called the 'criterial features' approach
does  include  recognition  principles  which  emphasize  precisely  those  features  that  are  relatively 
invariant  for  vocabularies  including  characters  that  may  occur  in  different,   but  similar  styles. 
Thus  Rochester  claims: 

"The  subject  character  recognition  mechanism  looks  for  the  basic  features  most 
invariant  throughout  the  range  of  fonts.     In  an  ideal  system,   these  features  should  be 
completely  independent  of  such  features  as  the  absolute  size  of  the  character,    slant 
proportions  and  the  presence  of  serifs.     The  criterion  of  the  basic  form  of  the  character 
is  its  continued  existence  through  all  conceivable  distortions  of  the  character  up  until  the 
point where a human observer becomes uncertain of the identity of an isolated specimen." 2/

Similarly,    Greene  has  noted  that  most  of  the  existing  devices  for  reading  written  characters 
fall  into  that  class  of  pattern  recognition  systems  where  the  device  is  designed  to  respond  to  any  one 
of  a  class  of  patterns  in  terms  of  the  answers  to  questions  that  are  built  into  the  machine.     He  goes 
on  to  suggest  that:     "Designing  these  devices  means  finding  a  combination  of  stable  and  reproducible 
perceptual determinants that serves to discriminate just the class you are looking for." 3/

The  choice  of  criteria  to  improve  the  recognition  process  for  variable  character  forms  has 
been  variously  based,   as  we  have  previously  noted,  on  a  priori   or  theoretical  assumptions  as  to  the 
characteristics  most  likely  to  be  significant  in  a  given  vocabulary,   or  on  empirical  observation, 
machine simulation, and trial-and-error testing. 4/ Examples of the latter include Stone, 5/
Weeks, 6/ and Howard, 7/ among others. In the Russian type style study made at New York
University, 8/ manual observations for different Russian fonts and for a wide range of defects and
mutilations of any given character led to recognition of the need to establish
"character-discriminating procedures based on essentially invariant gross aspects of characters".

1/ Gill, A. "Pattern recognition", Ref. 159, p. 677.

2/ Rochester, N., et al. U.S. Patent 2,889,535, Ref. 388.

3/ Greene, P.H. "Networks for pattern recognition," Ref. 179, p. 1.

4/ Compare, for example, Minsky and Selfridge: "... patterns can often be defined by listing
the properties which distinguish their exemplars from those of other patterns. In the
important case of patterns whose definitions are not known in advance but for which examples
are available, we can use experience to gather (statistical) evidence about the distribution of
properties among the patterns." (Ref. 312, p. 5.)

Ref. 221.

Some of the aspects suggested for this purpose in the Russian study were: measurements of the extent to which
characters  exceeded  a  constant-height  band,   total  width,   open  or  closed  structure,    symmetry  as 
determined  with  respect  to  the  optical  center  of  gravity,    compactness  as  determined  by  whether  the 
optical  center  of  gravity  coincides  with  figure  or  with  ground,   and  articulation  as  determined  by 
effects  of  dissecting  the  character  either  at  various  levels  or  by  a  superimposed  grid. 
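
One of these gross aspects can be illustrated with a minimal sketch, which is not taken from the New York University study. It assumes a pattern quantized into a grid of 0 (ground) and 1 (figure) cells; the function names `center_of_gravity` and `is_compact` are hypothetical, chosen only for this example. The sketch locates the optical center of gravity and reports whether it coincides with figure or with ground.

```python
def center_of_gravity(grid):
    """Return the (x, y) centroid of the black cells in a 0/1 grid."""
    cells = [(x, y) for y, row in enumerate(grid)
             for x, value in enumerate(row) if value]
    count = len(cells)
    return (sum(x for x, _ in cells) / count,
            sum(y for _, y in cells) / count)


def is_compact(grid):
    """True if the centroid falls on a black (figure) cell, False if on ground.

    Assumption for this sketch: the centroid is rounded to the nearest cell.
    """
    cx, cy = center_of_gravity(grid)
    return bool(grid[round(cy)][round(cx)])


# A crude "T" (centroid on the stem) and a crude "O" (centroid in the hole).
letter_t = [[1, 1, 1],
            [0, 1, 0],
            [0, 1, 0]]
letter_o = [[1, 1, 1],
            [1, 0, 1],
            [1, 1, 1]]
print(is_compact(letter_t), is_compact(letter_o))
```

On such small grids the rounding rule matters; a practical system would presumably use a finer quantization.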

Since one of the major requirements for a discriminant or property is that it be "... invariant
under the commonly encountered equivalence transformations," 1/ we also find the gathering of
statistical  evidence,   especially  by  machine  simulation,   in  this  area.     In  recent  work  directed  by  Alt 
at  the  National  Bureau  of  Standards,    2/  there  has  been  an  investigation  of  the  extent  to  which  quantized 
patterns  are  characterized  by  their  moments.     Certain  combinations  of  these  moments  provide  such 
information  as  relative  distribution  of  black  with  respect  to  distance  from  center  of  gravity,    symmetry 
about  the  x  and  y  axes,   and  the  like.     Computer  simulation  has  been  used  to  determine  combinations 
that,   for  a  given  vocabulary  set,   are  invariant  for  input  members  of  the  set  under  transformations  of 
location, size, stretching and squeezing, rotation to the extent of the slanting found in italic versus
Roman  versions  of  the  same  character,    some  noise,   and  minor  changes  or  embellishments  such  as 
serifs. 
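
The general idea of characterizing a quantized pattern by its moments can be sketched as follows. This is an illustration under stated assumptions, not the combinations investigated by Alt, whose report was still in preparation. The sketch uses the standard second-order central moments, which are unchanged by translation, divided by a power of the black-cell count so that they are also approximately unchanged by a change of size. The function name `normalized_moments` and the grid representation are assumptions made for this example.

```python
def normalized_moments(grid):
    """Return (eta20, eta02, eta11) for the black cells of a 0/1 grid.

    Central moments are taken about the center of gravity, so the result
    does not depend on where the pattern sits in the grid.
    """
    cells = [(x, y) for y, row in enumerate(grid)
             for x, value in enumerate(row) if value]
    m00 = len(cells)
    cx = sum(x for x, _ in cells) / m00
    cy = sum(y for _, y in cells) / m00

    def eta(p, q):
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in cells)
        # Standard size normalization: mu_pq / m00 ** (1 + (p + q) / 2).
        return mu / m00 ** (1 + (p + q) / 2)

    return eta(2, 0), eta(0, 2), eta(1, 1)


letter_t = [[1, 1, 1],
            [0, 1, 0],
            [0, 1, 0]]
print(normalized_moments(letter_t))
```

The mixed moment `eta11` vanishes for a pattern that is symmetric about a vertical or horizontal axis through its center of gravity, which is one way such combinations can express the symmetry information mentioned above.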

In  the  search  for  relative  invariance  among  hand-drawn  patterns  (including  handwritten 
characters,   geometric  shapes,   line  drawings,   and  the  like),   both  the  intuitive  (a  priori)  and  the 
heuristic (empirically determined, machine-tested, machine-generated) methods for property-
selection have been used. 3/ Techniques that have already been mentioned include those of Doyle, 4/
Sherman, 5/ and Grimsdale, 6/ for example. A special combination of theoretical and empirical
approaches  to  the  search  for  invariant  features  is,   however,    involved  in  certain  areas  of  pattern 
recognition  research.     That  is,   it  is  assumed  on  theoretical  grounds  that  factors  important  in  human 
or  animal  perception  and  pattern  recognition  may  also  be  important  in  machine  recognition. 

The  Himmelman  and  Chu  work  at  RCA  previously  mentioned  with  respect  to  the  machine  idea  of 
an "E", 7/ included detailed studies of human performance with systematically deteriorated character
samples, with the determination of the relative importance of various critical areas in the identification
of badly degraded characters, and with determination of what character stroke combinations are
associatively clustered. Eden and Halle 8/ have discussed both the synthesis of cursive handwriting
and its analysis, finding that 18 strokes appear to be discriminative for well-formed Latin scripts.
Neisser and Weene 9/ have also studied human recognition performance, using the same handprinted
upper  case  letters  used  in  the  Sherman  machine  experiments,   to  determine  types  of  error,   overall 
accuracy,   and  confusion  data. 

Others  who  have  considered  the  factors  in  human  perception  as  potentially  applicable  to  machine 
recognition  of  patterns  include  Uhr  and  investigators  at  the  System  Development  Corporation. 
Typical of conclusions reached are the following:

1/ See Minsky, M. "Steps toward artificial intelligence," Ref. 314, p. 13.

2/ A report on results by F.L. Alt, tentatively titled: "Digital pattern recognition by moments,"
is in process of preparation.

"Psychophysical and introspective data provide elements into which figures might be
decomposed  with  maximal  retention  of  important  information.     These  elements  are  second 
degree  curves,    8  lengths,    8  slopes,    5  curvatures,   with  differentiations  along  each  of  their 
dimensions.     Humans  make  similar  differentiations  in  absolute  judgment  experiments. 
These  geometric  elements,   along  with  information  about  their  interconnections,    seem  to 
contain  all  of  the  information  that  is  usually  elicited  when  someone  is  asked  to  give  a 
complete, detailed description of a line pattern." 1/

"Consider  lines  of  eight  possible  lengths,   eight  possible  slopes  and  eight  possible 
curvatures. This is, roughly, the capacity of the human perceiver when he makes absolute
judgment... A program I am now coding will ... follow line segments, while under the
control of an assessing subroutine, until it has identified a complete element as one of this
9-bits worth of possible elements ... It will identify the next element, store the relative
location at which elements touch, and continue until it has completed the figure. These
elements seem almost 'natural' ways of describing figures--especially man-made figures
such as letters. For an example, an 'A' equals a vertical left and a vertical right touching
at the top; with a horizontal line joining their middles ...

The letters to be 'recognized' are stored in the computer's memory, along with their
characterization lists--literally, what they look like." 2/
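
The arithmetic behind the "9-bits worth of possible elements" is that eight lengths, eight slopes, and eight curvatures give 8 x 8 x 8 = 512 = 2^9 distinct elements. A minimal sketch of one possible packing follows. The field order and the names `encode_element` and `decode_element` are assumptions made for this example and are not taken from Uhr's program.

```python
def encode_element(length, slope, curvature):
    """Pack three 3-bit category indices (each 0..7) into one 9-bit code."""
    for index in (length, slope, curvature):
        if not 0 <= index < 8:
            raise ValueError("each category index must lie in 0..7")
    return (length << 6) | (slope << 3) | curvature


def decode_element(code):
    """Recover the (length, slope, curvature) indices from a 9-bit code."""
    return code >> 6, (code >> 3) & 0b111, code & 0b111


code = encode_element(3, 5, 1)
print(code, decode_element(code))
```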

Investigations of factors in perception presumed to obtain in lower order animals, from the
standpoint of machine simulation of pattern recognition, include those of Loebner, who uses differencing
between vertical and horizontal pairs to simulate the horizontal-vertical discrimination found in
the eye of the frog, 3/ and Deutsch, who has proposed a shape recognition technique based, in effect,
upon the projection of a contour upon an opposing one. 4/ Van Bergeijk and Harmon have recently
built a model of a small neural net that is capable of identifying curved as opposed to straight lines. 5/
Kamentsky, with a photoelectric model of a retinal mosaic, develops methods for the identification of
angles, endpoints, closed loops, and the like. 6/ Platt, however, suggests that straightness,
curvature, equidistance, and similar primitive elements of visual pattern recognition arise not from
direct operations on the pattern as received by the retina, but rather from the self-congruence of the
sources of the pattern in the external world under the rotations which the spherical eyeball can
perform. 7/

A  different  approach  to  the  use  of  a  priori  or  theoretical  assumptions  with  respect  to  significant 
features  of  patterns  that  may  provide  criteria  of  relative  invariance  is  provided  in  consideration  of 
applicability of theoretical principles such as those of graph theory (Sherman, 8/ Garmash 9/) and
integral  geometry.     Novikoff  and  others  at  the  Stanford  Research  Institute  have  been  investigating 
the  applicability  of  theoretical  principles  of  integral  geometry  to  the  problems  of  determining 
relatively  invariant  features  of  patterns.     One  of  the  techniques  proposed  as  a  result  of  these  studies 
involves,   for  example,   the  construction  of  reference  patterns  or  master  identification  formulas  by 
detecting the most probable intersections of samples of the characters to be included in the
vocabulary with one or more randomly tossed reference pattern elements. 1/

In several of the contour-following or curve-tracing proposals that we have previously
mentioned, 2/ there has been an emphasis on the search for relatively invariant features and an
assumption that properties such as curves and slopes are reasonably stable under transformations of
size and location. Weiss and Johnson, in a recent patent assigned to IBM for a 'form recognition'
system, claim the following:

"In  the  system  of  the  present  invention  the  registration  problem  is  essentially  eliminated 
by basing the comparison and recognition not upon geometrical coordinate information concerning
the unknown curve or shape as such, but rather by basing it upon a comparison of
computed  values  of  the  curve  which  are  invariant,    or  at  least  semi-invariant,    with  respect  to 
transformations  of  translation,    rotation  and  magnification.     For  the  purposes  of  this  application, 
a  property  of  the  curve  will  be  said  to  be  invariant  with  respect  to  a  given  transformation  or 
set  of  transformations  when  the  property  has  a  value  which  varies  with  a  point  traversing  the 
curve  but  which,   for  a  given  point  on  the  curve,    does  not  change  when  the  curve  is  mapped  on 
another  curve  by  any  succession  of  applications  of  the  transformation  or  transformations  with 
which  the  property  is  invariant.     Similarly  a  property  of  the  curve  is  said  to  be  semi-invariant 
with  respect  to  a  given  transformation  when  it  has  a  value  which  changes  by  at  most  a  constant 
additive  term  when  the  curve  is  mapped  on  another  curve  by  any  succession  of  applications  of 
transformations  with  respect  to  which  the  property  is  semi-invariant.     Furthermore,   it  can 
be  shown  that  the  functional  relationship  between  any  two  invariant  properties  characterizes  the 
whole  class  of  plane  curves  which  can  be  mapped  onto  one  another  by  a  succession  of 
applications  of  the  transformations  with  respect  to  which  the  properties  are  invariant.  "  3/ 

Weiss  and  Johnson  provide  a  computing  means  to  derive  the  values  for  an  unknown  curve  for  a 
comparison  with  standard  values  for  known  curves  or  shapes. 

Computer simulation has been used not only to arrive at invariant properties in curve-tracing
techniques for recognition of line drawings but also to test such systems. Haller, in a 704 simulation
program, 4/ provides for a tracing head which traverses successive small areas of attention along
the  contour  lines  of  a  quantized  pattern.     This  tracing  head  can  be  oriented  in  any  one  of  eight 
directions  and  it  follows  the  line  along  a  given  direction  as  long  as  possible.     However,   the  head  may 
explore  for  short  distances  at  right  angles  to  this  direction,   following  small  fluctuations  and  avoiding 
minor  gaps  which  may  be  caused  by  noise.     The  input  pattern  elements  are,   in  effect,   the  beginning 
and  end  points  of  the  runs  occurring  with  the  tracing  head  in  a  single  orientation  and  indication  of 
whether or not the run was terminated by an end of the line. Thus, this is a reductive curve-tracing
input  technique,   and  it  is  followed  by  a  decision-tree  program  for  recognition-identification. 

Two other proposed computer programs for search of relatively invariant properties by curve-
tracing techniques make use of list-processing languages: Hodes with LISP, 5/ and Shepard with
IPL-V. 6/ Hodes summarizes his studies as follows:

"A  method  for  mechanically  processing  line  drawings  is  being  programmed.     The 
process begins with a pattern that takes the form of marked points in a 100x100 square
array. First the pattern is converted to a readily usable form by a line-follower
program, which collects information about the lines and vertices and prints out a
list-structure description of the pattern. The original input pattern is then discarded, as
LISP programs process the list-structure description. LISP programs which have been
written include one to tell whether two patterns are the same under renumbering of
vertices and one for regaining simple components of overlapping patterns." 7/
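
The test of whether two patterns are "the same under renumbering of vertices" can be illustrated by a minimal sketch. It is written in Python rather than LISP and is not Hodes' program: it assumes each pattern has already been reduced to a list of lines joining numbered vertices, and it simply tries every renumbering, which is practical only for small patterns. The name `same_under_renumbering` is hypothetical.

```python
from itertools import permutations


def same_under_renumbering(lines_a, lines_b, vertex_count):
    """True if some renumbering of vertices 0..vertex_count-1 maps A onto B.

    Each pattern is a list of (vertex, vertex) pairs; direction is ignored.
    """
    a = {frozenset(line) for line in lines_a}
    b = {frozenset(line) for line in lines_b}
    if len(a) != len(b):
        return False
    for renumbering in permutations(range(vertex_count)):
        mapped = {frozenset(renumbering[v] for v in line) for line in a}
        if mapped == b:
            return True
    return False


# A triangle with a tail, drawn twice with different vertex numbers.
first = [(0, 1), (1, 2), (2, 0), (2, 3)]
second = [(3, 2), (2, 1), (1, 3), (1, 0)]
print(same_under_renumbering(first, second, 4))
```

The exhaustive search grows factorially with the number of vertices, so a practical program would first compare cheap properties such as the number of lines meeting at each vertex.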

Shepard similarly uses an initial 100x100 matrix and derives a list-structure description.

1/ Novikoff, A.B.J. "Integral geometry as a tool in pattern perception." Ref. 340; see also

2/ See pp. 53-55 of this report.

In the Shepard description, lines through various small overlapping subregions are noted and their local
slopes, lengths of arc, and whether there are end-points, branch-points, or direction-changes
(corners)  are  recorded.     The  sublists  constructed  in  this  way  are  connected  together  in  order  to 
preserve  the  topological  structure  of  the  original  pattern.     Various  property-extraction  routines  are 
then  applied  to  the  list-structure  representation,   yielding  a  series  of  yes-no  indications  for  a  given 
stimulus  pattern.     Finally,   the  program  is  intended  to  display  certain  "learning"  mechanisms.     These 
are  described  as  follows: 

"At  the  heart  of  this  program  is  a  list  of  the  names  of  the  property-extractor  sub- 
routines... Associated with the name of each property extractor on this list are two other
things: The first is a weight that determines the probability of application of that property
extractor (via the stochastic selection operation J-16 of IPL-V). The second is a sublist
of responses that the learning program has found to be associated with that property
together with weights that the learning program attaches to the responses so as to reflect
the frequency of those associations." 1/
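
The two items attached to each property extractor in this description, a selection weight and a weighted sublist of responses, can be sketched as below. This is a loose illustration in Python, not a reconstruction of Shepard's IPL-V program. The extractor names, the dictionary layout, and the functions `pick_extractor` and `record_association` are all assumptions made for the example.

```python
import random


def pick_extractor(weights, rng=random):
    """Choose one extractor name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def record_association(responses, extractor, response, amount=1.0):
    """Raise the weight linking an extractor to a response it accompanied."""
    table = responses.setdefault(extractor, {})
    table[response] = table.get(response, 0.0) + amount
    return table


# Hypothetical extractor names; a higher weight means more frequent application.
weights = {"has_closed_loop": 1.0, "has_crossbar": 3.0}
responses = {}
chosen = pick_extractor(weights, random.Random(0))
record_association(responses, chosen, "A")
print(chosen, responses)
```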

Techniques  designed  to  search  for  relatively  invariant  features  are  thus  often  intended  to 
combine  theoretical  assumptions  with  subsequent  empirical  observations,   followed  by  readjustments 
and  modifications,    such  that,   for  a  large  number  of  samples,   a  type  of  'learning'  may  be  said  to 
occur.     We  shall  discuss  other  examples  where  a  degree  of  learning  is  claimed  in  terms  of  machine 
models  of  perception,    recognition,   and  pattern  generalization.     We  note,  however,   that  an  important 
distinction  can  be  made  with  respect  to  the  hypothesis  that  criterial  features  or  characteristics 
significant  for  discrimination  can  be  isolated  by  essentially  random  operations  upon  learning  samples 
(Rosenblatt, 2/ Bledsoe-Browning, 3/ Novikoff, 4/ and others), as against the hypothesis that the
significant features reflect semantic-dependency with respect to specific recognizable patterns (Barus
in contrast to the Perceptron techniques; 5/ Unger; 6/ Kamentsky; 7/ Uhr; 8/ Kelly-Singer, 9/ etc.).
In  particular,   there  is  in  the  one  case  a  breaking-up  or  disregarding  of  connectivity  and  other 
characteristics  of  interdependence  of  pattern  elements,   and  on  the  other  an  emphasis,   at  least  in 
part,    on  precisely  such  interdependence. 

With  respect  to  the  latter  case,   Kamentsky  has  defined  the  technique  which  he  terms  'feature 
matching'  as  follows: 

"The    individual  elements  on  many  patterns.  .  .are  not  independent,    since  recognizable 
patterns  contain  constraints  on  form.     Parts  of  patterns  can  be  classified  in  terms  of 
independent groups of elements... Some of these groups include the geometrical parameters,
straight, curved, closed or open, and breaks or corners. These may be independent of
absolute position, size, noise, and some changes of form. We shall call these 'features of
the pattern'. Recognition of patterns is possible if a sufficient set of relevant features can
be extracted from the signal field." 10/

Derivation of such 'relevant' features by means of machine measurement of samples and
subsequent tests of identification has, as we have noted, been investigated by Selfridge, Dinneen,
Farley, and Clark at M.I.T., 1/ by Grimsdale et al. at the University of Manchester, 2/ and
by Unger and others with respect to machine models or spatial computers. 3/ Uhr has recently
reviewed a number of such techniques. 4/

Minsky,    considering  the  case  of  visual  pattern  recognition  with  respect  to  the  more  general 
area  of  so-called  'artificial  intelligence',   has  grouped  together  both  interdependent  and  independent 
feature  possibilities  in  a  listing  of  the  operations  and  transformations  that  may  be  useful  in  the  search 
for relative invariance for various interesting sets of patterns. 5/ These he classifies as: local
transformations,    including  local  averaging,   edging,   and  recognition  of  particular  local  configuration 
or  feature  extraction;  global  or  holistic  transformations,   including  translation,    rotation,   expansion, 
contraction,   filling  in  of  hollow  figures,    location  of  optical  center  of  gravity,   and  operations  to 
determine  connectivity;  and  "functionals",   or  operations  that  count  or  encode,    such  as  'blob'  counters, 
moment  of  figure  with  respect  to  a  given  point,    slope  of  line,  distance  between  two  points;  and  other 
transformations  such  as  projections  onto  a  line  or  axis,    or  the  mapping  of  a  pattern  perimeter  along 
some  reference  axis.     The  search  for  relative  invariance  includes  investigation  of  area-preserving 
transformations, shape-preserving transformations, image enhancement, contour projection, contour-
direction sequences, and selection of criterial features.

Thus we find a variety of research approaches to problems of relative invariance in the pattern
recognition field, many of which may provide useful cues for further progress in the development of
general-purpose character recognition techniques.

8.2 The Search for Pattern Separability

The  search  for  relative  invariance  of  character  patterns  in  terms  of  significant,    relevant, 
and  interdependent  features  provides  one  basis  for  separating  patterns  into  classes  or  categories, 
which  is  the  requirement  for  pattern  detection  in  contradistinction  to  recognition  as  one  of  a  finite 
number  of  known  pattern-types,   as  we  have  previously  noted.     This  is,   in  effect,   a  criterion  for 
pattern  separability  using  'Gestalt'-type  principles.     In  addition,   however,    research  efforts  in 
pattern  recognition  which  may  be  applicable  to  future  improvements  in  automatic  character 
recognition  systems  include  investigations  of  linear  separability,   probabilistic  separability,   and 
statistical  separability. 

The  question  of  linear  separability  is  in  the  first  and  most  obvious  sense  related  to  the 
consideration  of  the  Boolean  functions  of  n-variables,   which  are  the  black-white  determinations  per 
cell  of  a  grid  superimposed  on  the  source  pattern.     For  example,    Stearns  states  that  the  general 
problem  of  pattern  recognition  is  to  be  regarded  as:    "A  problem  wherein  the  recognition  device  is 
presented with a plane array of black-or-white elements and must decide to which general class
(pattern) this array belongs." 6/ Various techniques for the minimization of decision paths with
respect to two-valued coordinate descriptions, including those of Glovazky, 7/ Gill, 8/ Blokh, 9/
and Howard, 10/ are related to the problem of linear separability.

A concept involving the representation of the space of possible patterns as a binary n-cube in
n-space, with each vertex of the cube defining a specific combination of n binary variables, and the
use of hyperplanes as partitioning surfaces to achieve pattern separation, has been explored by
Mattson. 1/ Kirsch, 2/ in a discussion of Mattson's 1959 paper, assumes that the criterion of linear
separability  is  to  be  applied  in  the  case  of  reference  patterns  that  are  the  direct  binary  quantizations 
at  the    input    level  of  a  coordinate  description  of  a  source  pattern.     More  generally,   and  more 
significantly,  however,   the  Mattson  proposals  are  applicable  to  any  binary  representation  of 
characteristics  of  patterns,    specifically  including  the  case  where  this  is  a  two-valued  (yes-no) 
property  list  score. 
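
The notion of separating vertices of the binary n-cube by a hyperplane can be made concrete with a short sketch. It is not Mattson's procedure: it swaps in the familiar error-correction (perceptron) rule to search for weights and a threshold, and the name `find_hyperplane` and the epoch limit are assumptions made for this example. A result of `None` means only that no separating hyperplane was found within the limit, although for a truly inseparable labelling none exists.

```python
from itertools import product


def find_hyperplane(labels, n, max_epochs=1000):
    """Search for (weights, threshold) with sum(w*x) > threshold exactly on
    the vertices labelled True.

    labels maps each n-tuple of 0/1 values to True or False.
    """
    weights = [0] * n
    threshold = 0
    for _ in range(max_epochs):
        errors = 0
        for vertex, wanted in labels.items():
            fired = sum(w * x for w, x in zip(weights, vertex)) > threshold
            if fired != wanted:
                errors += 1
                step = 1 if wanted else -1
                # Move the hyperplane toward classifying this vertex correctly.
                weights = [w + step * x for w, x in zip(weights, vertex)]
                threshold -= step
        if errors == 0:
            return weights, threshold
    return None


square = list(product((0, 1), repeat=2))
both_black = {v: all(v) for v in square}         # linearly separable
exactly_one = {v: v[0] != v[1] for v in square}  # not linearly separable
print(find_hyperplane(both_black, 2), find_hyperplane(exactly_one, 2))
```

The same function applies unchanged when the n binary variables are yes-no property scores rather than black-white cells, which is the more general case noted above.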

As  we  have  observed  previously,   criterial  features  extraction  techniques  often  result  in  a  form 
of  property  list  and  thus  may  be  encoded  either  in  a  binary  code  indicating  the  presence  or  absence  for 
a  given  pattern  of  each  of  the  properties  considered  or  in  a  code  statement  indicating  a  particular 
value  or  range  of  values  as  detected  for  each  property.     In  this  sense,   the  criterial  features  techniques 
involve  'linear  statistical  classification',    defined  by  Laemmel  as  follows: 

"It  is  desired  to  tell  which  of  K  classes  an  individual  belongs  to.     The  classification  is 
to be based on the results of n tests or measurements which are made on the unknown
individual. The method of classification... is called 'linear' because the results of the
separate tests are combined in a linear way, and 'statistical' because the result of the
classification is usually known to be correct only with some probability less than one." 3/

Results  of  applying  tests  for  various  character  features  in  character  recognition  techniques  such 
as  those  of  Weeks,   Doyle,   and  Grimsdale  et  al,   are  probabilistic  in  this  sense  and  thus  provide 
examples  of  application  of  linear  statistical  classification  techniques.     Weeks,   for  example,   claims 
the  following: 

"First  a  number  of  statistical  properties  of  the  character  to  be  recognized  are  studied 
and tested. An example . . . is a determination of the number of crossings encountered when
the light beam moves across the character in a particular direction and position . . . The
probability of occurrence of different numbers of crossings by a particular scan line can be
measured by testing a large number of characters and determining the fractional occurrence
of the number of crossings for each character." 4/
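
The crossing-count property in the passage above can be illustrated roughly as follows. This is a sketch under stated assumptions: characters are taken to be small binary grids (1 = black, 0 = white), a 'crossing' is counted as each white-to-black transition along one horizontal scan line, and the sample grids are invented for illustration. None of these details is given by Weeks.

```python
from collections import Counter


def count_crossings(row):
    """Count white-to-black transitions along one scan line."""
    crossings = 0
    previous = 0  # the field outside the character is assumed white
    for cell in row:
        if cell == 1 and previous == 0:
            crossings += 1
        previous = cell
    return crossings


def crossing_distribution(samples, scan_row):
    """Fractional occurrence of each crossing count over samples of one character."""
    counts = Counter(count_crossings(sample[scan_row]) for sample in samples)
    total = len(samples)
    return {n: c / total for n, c in counts.items()}


# Four hypothetical samples of the same character; only row 1 is scanned.
SAMPLES = [
    [[0, 1, 1, 1, 0], [0, 1, 0, 1, 0], [0, 1, 1, 1, 0]],
    [[0, 1, 1, 1, 0], [0, 1, 0, 1, 0], [0, 1, 1, 1, 0]],
    [[0, 1, 1, 1, 0], [0, 1, 1, 1, 0], [0, 1, 1, 1, 0]],
    [[0, 1, 1, 1, 0], [1, 0, 0, 1, 0], [0, 1, 1, 1, 0]],
]

distribution = crossing_distribution(SAMPLES, scan_row=1)
print(distribution)
```

With enough samples per character, such tables of fractional occurrence can serve as the estimated probabilities referred to in the quotation.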

In the techniques of Grimsdale et al., properties to be used are again determined by tests on many
samples, and, when an unknown is thereafter tested, the method results either in identification or,
if there is ambiguity, in printout of the probability scores of the several characters the source
pattern may represent. 5/

Criteria for probabilistic separability, in addition to those cases where
interrelationships between properties or features are assumed, include those where the results
of tests or measurements are assumed to be independent, as in the technique for finding
optimum sets of such tests described by Howard. 6/ Closely related are differential weightings
allowed in some of the coordinate description techniques, such as that of Taylor. 7/
Weighting of single cells, in terms of conditional probabilities determined by experience
with samples, was apparently first suggested by Uttley, 8/ and one of his conditional
probability machine models, demonstrated at the Teddington Symposium on the Mechanization of
Thought Processes in 1958, successfully recognized "T" shapes. Related work ranging through
pattern recognition research to practical suggestions for character recognition has also included that
of Highleyman and Kamentsky, 1/ Stearns, 2/ and Baran and Estrin. 3/

Highleyman and Kamentsky have re-investigated the Bledsoe-Browning technique of point-pair
n-tuple  probabilistic  sampling  with  considerably  less  success  than  was  reported  by  the  original 
investigators.     On  the  other  hand,   Highleyman  has  recently  been  awarded  a  patent,   assigned  to  Bell 
Telephone  Laboratories,    in  which  probability  matrices  are  used  in  conjunction  with  weighted  area 
scanning  methods.     A  representative  claim  is  as  follows: 

". . . Apparatus for classifying line trace patterns comprising a record containing a
line trace pattern, scanning means for detecting those portions of said pattern that occupy
preselected portions of a matrix of areas that encompass said pattern, means for storing
a plurality of probability matrices, the individual areas of each of said probability matrices
being suitably weighted in accordance with the occupancy probability of that area by a
pattern from a selected ensemble of patterns, means for systematically comparing said
detected pattern portions with corresponding areas of all of said probability matrices,
means for developing a signal whose magnitude is proportional to the degree of correlation
between said portions and each of said stored matrices." 4/
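
One way to picture the comparison step in this claim is sketched below. It assumes, purely for illustration, that the scanned pattern is a binary grid, that each stored probability matrix holds the estimated occupancy probability of each area for one pattern class, and that the 'degree of correlation' is a simple sum of products. The patent text quoted here does not specify the weighting or the correlation measure, and the 3 x 3 matrices are invented.

```python
def correlation_scores(pattern, prob_matrices):
    """Sum, for each class, the occupancy probabilities of the areas the pattern occupies."""
    scores = {}
    for label, matrix in prob_matrices.items():
        scores[label] = sum(
            p * cell
            for prob_row, cell_row in zip(matrix, pattern)
            for p, cell in zip(prob_row, cell_row)
        )
    return scores


def classify(pattern, prob_matrices):
    """Return the label whose probability matrix correlates best with the pattern."""
    scores = correlation_scores(pattern, prob_matrices)
    return max(scores, key=scores.get)


# Hypothetical occupancy probabilities for a vertical bar "I" and a horizontal bar "-".
PROB_MATRICES = {
    "I": [[0.1, 0.9, 0.1], [0.1, 0.9, 0.1], [0.1, 0.9, 0.1]],
    "-": [[0.1, 0.1, 0.1], [0.9, 0.9, 0.9], [0.1, 0.1, 0.1]],
}

# A slightly noisy vertical bar: one extra black cell in the lower right corner.
PATTERN = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]

scores = correlation_scores(PATTERN, PROB_MATRICES)
print(classify(PATTERN, PROB_MATRICES), scores)
```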

Stearns  has  achieved  recognition  of  typewritten  numerals,   to  an  accuracy  of  99%,   by  statistical 
analysis  of  a  large  number  of  representative  samples  to  determine  probabilities  of  black,    white,   and 
gray (ambiguous) for each cell, to which a probabilistic decision-tree logic can then be applied. 5/
Baran and Estrin have worked with deteriorated numeric characters, in a "learning"-type system, in
which  they  also  determine  the  probabilities  of  black  in  each  cell  for  a  large  number  of  samples  of  each 
possible  pattern.     They  report  further: 

"...   first,   the  a  priori  probability  distribution  of  each  black  and  white  cell  for  each 
of  the  possible  characters  is  computed.     Next,   a  set  of    hypotheses  is  established  that  the 
unknown  character  is  each  of  the  possible  characters  of  the  allowable  set. 

"The  a  posteriori  probability  of  each  of  these  hypotheses  being  true  is  tested  with 
Bayes' formula, using the data from each cell position as a separate measurement." 6/
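
The procedure quoted above, in which each cell is treated as a separate measurement, can be sketched as follows. The code assumes that the cells are statistically independent given the character, that the per-cell probabilities of black lie strictly between 0 and 1, and that the characters are equally likely unless priors are supplied. The function name and the four-cell example are illustrative assumptions, not data from Baran and Estrin.

```python
import math


def posteriors(cells, black_prob, priors=None):
    """A posteriori probability of each character hypothesis by Bayes' formula.

    cells:      flat list of observed cells (1 = black, 0 = white)
    black_prob: {label: [P(cell i is black | label), ...]}, each value in (0, 1)
    priors:     optional {label: P(label)}; equal priors are assumed if omitted
    """
    labels = list(black_prob)
    if priors is None:
        priors = {label: 1.0 / len(labels) for label in labels}
    log_joint = {}
    for label in labels:
        total = math.log(priors[label])
        for cell, p in zip(cells, black_prob[label]):
            total += math.log(p if cell else 1.0 - p)
        log_joint[label] = total
    peak = max(log_joint.values())  # subtract the peak for numerical stability
    unnormalized = {label: math.exp(v - peak) for label, v in log_joint.items()}
    norm = sum(unnormalized.values())
    return {label: v / norm for label, v in unnormalized.items()}


# Hypothetical four-cell "retina" and two candidate characters.
BLACK_PROB = {
    "1": [0.1, 0.9, 0.1, 0.9],
    "7": [0.9, 0.9, 0.1, 0.9],
}

result = posteriors([0, 1, 0, 1], BLACK_PROB)
print(result)
```

For the observation shown, the likelihoods are 0.6561 for "1" and 0.0729 for "7", so with equal priors the posterior probability of "1" is 0.9.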

Similarly,    Chow  has  studied  techniques  for  pattern  separability  directed  toward  minimization  of 
pattern  redundancy  and  toward  a  basis  for  decision-making  that  is  probabilistic  and  that  minimizes 
risks of misrecognition. 7/ Chow's work has been characterized as follows: "The problem is
regarded as one of testing multiple hypotheses in statistical inference, that is, testing the hypothesis
that the observed pattern is indeed the given character against the hypothesis that it is not." 8/

In investigations at United Research, 9/ mathematical analyses for the 'decoding' aspect of
character recognition systems - that is, recognition-matching and identification-decision operations -
have been explored. 'Expectation matrices' for each character in the vocabulary set are to be
established  on  the  basis  of  sampling    such  that  the  probability  of  a  given  cell  being  black,    given  that 
the  source  pattern  is  a  member  of  the  Rth  character-class  in  the  vocabulary,    is  determined.     These 
results  are  to  be  mapped  onto  a  line,   and  a  scalar  score  developed,    such  that  non-overlapping 
intervals  along  the  line  can  be  used  to  identify  unknown  characters.     The  research  problem  involves 
the  weights  that  can  be  found  to  satisfy  the  expected  value  of  the  score,    given  that  the  source  pattern 
is  the  Rth  character  in  the  vocabulary  set,    and  to  avoid  purely  intuitive  arbitrary  choices  of  tests  to 
be made on source patterns. This research is therefore clearly directed to the question of
probabilistic pattern-class separability, under conditions involving a varying degree of noise. 1/ The
initial research investigations make the assumption that all R characters of a given vocabulary set
are equiprobable and that the costs of error, for any character in the set, are equally important.
Possible  implications  for  future  research,   however,   include  first,   extension  of  linear  sum  weighting 
schemes,    or  vector  multiplications,    to  the  use  of  more  general  matrix  operations.     On  the  other  hand, 
the  proposals  suggest  that  the  determination  of  reliability  measures  might  also  include  considerations 
of  relative  frequency  of  character  occurrence  and  of  relative  risk-cost  associated  with  specific 
possible  misrecognitions.     Weightings  based  on  white  as  well  as  on  black  areas,   and  the  elimination 
of  areas  that  are  always  black  or  always  white,    regardless  of  which  character  is  presented,   are  also 
considered. 

Many  of  the  pattern  recognition  techniques  using  independent  measures  of  probabilistic 
separability  have  been  criticized  in  the  literature  on  various  grounds.     First,    it  is  pointed  out  that 
highly  reductive,   but  random-operator,   techniques  such  as  the  75-point-pair  logic  of  Bledsoe  and 
Browning, are often severely limited as to the number of patterns they can successfully distinguish,
even without allowing for invariance under commonly encountered transformations. Difficulties have
been met by other investigators in trying to replicate the original results. 2/ Bailey, for example,
has  remarked  that  every  possible  point-pairing  is  necessarily  more  powerful  than  that  of  Bledsoe- 
Browning,    since  it  includes  all  random-pair  configurations  and  must  therefore  do  at  least  as  well  as 
any  of  them.     He  says: 

"While  many  examples  may  be  submitted  to  demonstrate  the  effectiveness  of  a  system, 
yet  if  one  reasonable  example  can  be  shown  for  which  a  system  fails,   the  validity  of  the 
technique  becomes  subject  to  question.     The  essence  of  these  remarks  is  that  for  many 
paired-point configurations, negative examples may be found. In these cases, the
point-paired principles must be less effective than a simpler technique, such as point-by-point
comparison." 3/

A second point of criticism against many of these techniques is that of the neglect of the special
relationships  or  the  criterial  features  that  occur  at  adjacent  points  in  the  image  field.     In  a  survey 
of  pattern  recognition  research  conducted  by  Stanford  Research  Institute,   it  was  concluded  that  since 
patterns  are  generally  made  up  of  connected  points  in  the  field,    schemes  which  do  not  make  use  of 
adjacency relationships tend to lose in recognition efficiency. Techniques considered by S.R.I.
to fall in this category included Bledsoe-Browning, Baran-Estrin, Highleyman-Kamentsky, Stearns,
and Novikoff. 4/

Closely  related  to  the  question  of  neglect  of  the  interdependence  of  features  in  many  patterns  is 
the  criticism  of  the  use  of  independent  measures  on  theoretical  grounds.     Thus  both  Minsky  and 
Hawkins  suggest  the  need  for  considering  joint  conditional  probabilities  and  point  out  the  restrictive 
nature of assumptions of statistical independence in systems using probabilistic tests. 5/

The  criterion  for  pattern  separation  that  has  been  termed  'statistical  separability',   however, 
not  only  assumes  the  independence  of  specific  character  features  but  also  emphasizes  the  introduction 
of  strictly  random  elements  in  the  organization  of  the  system.     This  approach  has  been  most 
extensively used in the simple Perceptron systems. In the simple, or alpha-type, Perceptron, there
is first an array of sensory units, simulating a retinal mosaic. Each sensory unit is randomly
connected to cells in another array, that is, to 'association' or 'A-units'. Each A-unit, in turn, is
randomly  connected  to  a  'response'  or  R-unit.     As  the  Perceptron  is  shown  various  examples  of  a 
pattern,   A-units  that  have  resulted  in  a  'correct'  response  are  reinforced,    increasing  the  likelihood 
that  the  correct  response  will  become  more  and  more  consistently  activated  for  additional  samples 
of  the  same  pattern. 
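
The organization just described can be sketched in a few lines of Python. The sketch rests on several assumptions that go beyond the text: connections from sensory units to A-units are excitatory only, an A-unit fires when at least one of its sensory units is on, there is a single R-unit, and the value of each active A-unit is adjusted by a fixed step only when the response is wrong (an error-correction rule substituted here for simplicity). All function names and parameters are hypothetical.

```python
import random


def build_a_units(n_sensory, n_assoc, fan_in, seed=0):
    """Connect each A-unit to `fan_in` sensory units chosen at random."""
    rng = random.Random(seed)
    return [rng.sample(range(n_sensory), fan_in) for _ in range(n_assoc)]


def a_activity(retina, a_units, threshold=1):
    """An A-unit is active when enough of its sensory units are on."""
    return [
        1 if sum(retina[i] for i in connections) >= threshold else 0
        for connections in a_units
    ]


def respond(activity, values):
    """The R-unit answers 1 when the summed values of active A-units are positive."""
    return 1 if sum(v for a, v in zip(activity, values) if a) > 0 else 0


def train(samples, a_units, epochs=20, step=1.0):
    """Adjust the values of active A-units whenever the response is wrong."""
    values = [0.0] * len(a_units)
    for _ in range(epochs):
        for retina, target in samples:
            activity = a_activity(retina, a_units)
            if respond(activity, values) != target:
                delta = step if target == 1 else -step
                values = [v + delta * a for v, a in zip(values, activity)]
    return values


# Deliberately trivial example: a fully lit 16-cell retina versus an empty one.
A_UNITS = build_a_units(n_sensory=16, n_assoc=10, fan_in=3, seed=1)
SAMPLES = [([1] * 16, 1), ([0] * 16, 0)]
VALUES = train(SAMPLES, A_UNITS)
print(respond(a_activity([1] * 16, A_UNITS), VALUES))
```

The statistical character of the system lies in build_a_units: a different seed gives a different wiring, but for this simple task the trained response is the same.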

The  importance  of  the  statistical,    or   random,   organizational  principle  for  single  perceptrons 
has  been  described  by  Rosenblatt  as  follows: 

"It  should  be  noted  that  for  any  given  perceptron,    regardless  of  which  table  of  random 
numbers  was  used  in  its  construction,   it  would  be  possible  in  principle  to  write  a  complete 
McCulloch-Pitts logical equation, describing the set of all possible states of the system as
a  logical  function  of  the  set  of  all  possible  inputs  .  .  .    The  fact  is,   that  the  logical  specification 
equation  is  a  particularly  idiosyncratic  function  of  the  specific  network  and  is  apt  to  be  totally 
different  for  two  networks  having  identical  performance  characteristics.  .  .     Nonetheless  (if 
the  systems  are  large  enough)  those  functional  characteristics  which  we  are  most  concerned 
with,    such  as  the  ability  correctly  to  discriminate  a  particular  pair  of  forms  would  be  found 
to  vary  little,    so  long  as  the  statistical  rules  remain  unchanged.     It  appears,   therefore,   that 
the  statistical  rules  come  closer  to  a  canonical  specification  of  what  is  most  important  for 
the  system  to  operate  properly.     The  number  of  logical  specifications  which  fit  the  bill  is 
astronomical,   and  for  the  most  part  these  constitute  interchangeable  variations  on  a  theme; 
but  any  violation  of  the  statistical  structure  of  a  perceptron  is  likely  to  radically  alter  the 
performance of the system." 1/

A  more  sophisticated  type  of  Perceptron  will  be  discussed  briefly  in  connection  with  mention  of 
research  on  machine  models  of  perception. 

8.3    Automatic Classification of Patterns


Techniques  for  discovery  of  relatively  invariant  features  of  patterns  and  for  determination  of 
effective  procedures  for  separating  patterns  may  provide  useful  clues  for  improvements  in  character 
recognition  systems.     They  may  or  may  not,   however,   involve  the  kind  of  generalization  that  is 
assumed  to  be  involved  in  human  perception  and  pattern  detection.     Thus,   a  distinction  that  has  been 
made with respect to human or animal perception and recognition involves "discriminative skill" on
the one hand, and "discriminative matching" on the other. In defining this distinction, Bruner et al. 2/
include under "skill" such operations as abstracting relevant from irrelevant stimulus details,
comparable  to  criterial  features  extraction,   and  overcoming  distraction,    such  as  is  involved  in  image 
enhancement  and  weighting  operations.     Under  the  term  "discriminative  matching"  they  include  not 
only  the  processes  of  sorting  into  categories  but  also  those  of  learning  what  categories  to  use. 

With  respect  to  the  similar  processes  in  pattern  recognition  by  machine,   Kelly  stresses  the 
following  points,   among  others: 

"Webster's  dictionary  defines  recognize  as  meaning  to  know  again,    implying  that  the 
cognitive mechanism has seen the object before and learned to know it. The related process
of  classifying  objects  which  have  never  been  previously  observed  is  covered  by  the  topic  of 
generalization.  .  .  . 

"A  recognition  device  which  is  restricted  in  the  world  of  patterns  to  which  it  is  exposed 
and  reliably  classifies  all  members  of  that  world  into  their  proper  groups  has  not  the 
flexibility  that  is  associated  with  bionics.     For  such  flexibility  the  recognition  scheme  should 
include  provisions  for  generalization  and/or  abstraction.     To  generalize,    is  defined  as 
meaning  to  derive  or  induce  from  particulars.     It  is  required  that  the  bio-computer,   after 
having  learned  to  classify  a  set  of  members  of  its  world  of  patterns,   be  sufficiently  flexible 
in design that it will then be able to classify all possible members of that world with some
reasonable probability that patterns similar in the features to which the sensory field is
sensitive will be classified in the same group." 3/


1/ Rosenblatt, F. "Perceptual generalization over transformation groups", Ref. 397, pp. 67-68.

2/ Bruner, J. S., G. A. Miller and O. Zimmerman. "Discriminative skill and discriminative
matching in perceptual recognition", Ref. 64.

3/ Kelly, P. M. "Problems in bio-computer design", Ref. 252, pp. 1-1, 1-9.

In considering progress in the field of character recognition and related research, therefore,
it  is  not  surprising  to  find  techniques  and  proposals  for  automatic  classification  of  patterns  ranging 
from  those  which  involve  no  generalization  at  all,   through  those  which  involve  generalization  only 
to a limited degree, to those which attempt to provide for machine acquisition of suitable
classification categories and for subsequent modification. In the first category are the various template
techniques.     Systems  or  techniques  that  utilize  a  limited  degree  of  apparent  'generalization'  include 
those  that  select  the  particular  tests  or  criterial  features  which  effectively  discriminate  among  the 
members  of  a  particular  vocabulary. 

In  the  recent    Stanford  Research  survey  of  pattern  recognition  techniques  certain  distinctions 
between  possibilities  for  automatic  pattern  classification  involving  various  degrees  of  'generalization' 
are  brought  out  in  a  discussion  of  so-called  'learning'  systems.     Thus: 

"In  the  'learning  machine'  type  of  program,    the  reference  information  against  which 
the  incoming  retinal  data  is  compared,    is  developed  by  correlating,   and  superimposing  in 
storage,    signals  derived  from  a  series  of  typical  patterns.     The  operator  identifies  each 
pattern  as  it  arrives  during  the  learning  cycle,    and  controls  the  information  that  is  stored. 
The  equivalent  of  a  template  is  prepared  against  which  the  incoming  retinal  field  is  matched; 
the  excellence  of  the  match  is  a  function  of  the  homogeneity  of  the  information  presented 
during  the  learning  cycle,   and  of  the  correlation  between  the  particular  pattern  under 
examination  and  the  stored  data.     Variations  of  size  are  likely  to  be  highly  significant.  .  .  . 
By  dividing  the  retinal  information  into  several  channels  at  each  data  point,    it  is  possible 
for  the  machine  to  learn  a  particular  pattern  in  several  orientations  or  positions,    but  in 
many  schemes  it  is  necessary  for  the  machine  to  learn  every  pattern  in  every  orientation. 
It  does  not  transfer  to  other  patterns  the  transformations  it  has  'learned'  for  a  specific 
pattern  -  it  has  not  learned  the  transformation.     This  is  really  the  fundamental  difference 
between  template  matching  and  learning,    and  it  would  be  desirable  to  restrict  the  use  of 
the  term  learning  machine  to  a  system  that  generates  an  internal  structure  which  can 
transfer to new patterns the facility it acquired with respect to a previous pattern." 1/

-  That  is,    to  the  potentialities  for  automatic  classification  or  generalization  of  patterns,   based 
upon  adjustments  and  modifications  derived  from  sequential  experience  with  a  variety  of  samples 
of  each  pattern  class. 

In  addition  to  this  distinction,    we  should  first  note  that  considerable  emphasis  has  been  placed 
on self-organizing principles in the development of models of "learning" processes regardless of the
degree of generalization achieved. 2/ Simultaneously we note that, on the other hand, studies of
general techniques of automatic classification (Baum, _' Laemmel, ±1 McLachen, 4/ Mattson, — '
Sebestyen, W and others) are in order. Thus, more generalized studies of methods for determining
appropriate  membership  in  classes,    given  that  'class'  definitions  have  been  derived  from  the 
property-vectors  of  the  samples  of  the  class,   are  obviously  pertinent  to  further  progress  in  this 
area of pattern recognition research. Sebestyen, in particular, has experimentally demonstrated
automatic  recognition  of  membership  in  pattern  classes  whose  definitions  are  derived  from  machine 
observations  of  representative  samples,   for  the  case  of  spoken  numerals. 

On  the  other  hand,   problems  of  relative  weighting  of  input  pattern  elements,    dependent  on 
sequential-experience  histories,    involved  in  some  adaptive  systems  for  pattern-classification,   are 
sometimes simplified for purposes of facilitating mathematical analysis. A treatment of adaptive
switching circuits by Widrow and Hoff has been considered to be of this type. By comparison with
the  Taylor,    Rosenblatt,   Farley-Clark,    Roberts,   and  related  systems,    Farley  considers  the  Widrow- 
Hoff  proposals  to  represent  an  essentially  simpler  system.     In  a  review  of  the  Widrow-Hoff  paper, 
he  comments,    in  part,   as  follows: 

1/ Brain, A. E., A. Macovski, et al. "Graphical data processing research study and


"...   Several  systems  make  use  ...   of  combinations  of  nonlinear  threshold-elements 
each  of  which  has  a  number  of  inputs  whose  effectiveness  or  'weights'  are  adjustable.     If 
an  input  mosaic  is  suitably  connected  to  elements  of  this  type  and  appropriate  rules  are  used 
to  adjust  the   weights  while  presenting  sequential  'experience'  in  the  form  of  examples  of 
inputs  and  their  desired  classification,    it  has  been  found  that  the  weight  parameters  can 
converge  on  values  which  will  effect  the  desired  classification  more  or  less  closely  .  .  . 

"In  the  present  paper  .  .  .   by  Widrow  and  Hoff  ...   a  simpler  system  is  considered  in 
which  each  mosaic  point  connects  through  its  own  individual  adjustable  weight  to  a  single 
time-invariant threshold-element. This simplified system is more amenable to analysis,
and the authors show that the problem of finding suitable weights in this system involves
searching for the minimum of a multidimensional parabolic surface . . . This and . . . other
systems . . . exhibit the 'overlap' generalization in which two sets tend to be 'similar' if
they have elements in common." 1/
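
The search for the minimum of a parabolic (quadratic) error surface mentioned in this review can be illustrated with the least-mean-squares rule commonly associated with Widrow and Hoff. The sketch below is an assumption-laden illustration rather than a reproduction of their circuit: inputs are coded +1 and -1, there is no bias weight, and the step size, epoch count, and two-pattern example are invented.

```python
def lms_train(samples, n_inputs, rate=0.1, epochs=20):
    """Descend the quadratic error surface one sample at a time.

    Each weight moves in proportion to the error between the desired
    output and the analog (pre-threshold) sum of weighted inputs.
    """
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for x, desired in samples:
            analog = sum(w * xi for w, xi in zip(weights, x))
            error = desired - analog
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    return weights


def analog_sum(weights, x):
    """Weighted sum presented to the threshold element."""
    return sum(w * xi for w, xi in zip(weights, x))


def threshold_output(weights, x):
    """Fixed threshold element: +1 if the weighted sum is non-negative, else -1."""
    return 1 if analog_sum(weights, x) >= 0 else -1


# Two hypothetical four-point mosaic patterns with desired classes +1 and -1.
PATTERN_A = [1, 1, -1, -1]
PATTERN_B = [1, -1, 1, -1]
TRAINED = lms_train([(PATTERN_A, 1), (PATTERN_B, -1)], n_inputs=4)
print(threshold_output(TRAINED, PATTERN_A), threshold_output(TRAINED, PATTERN_B))
```

Because every mosaic point has its own weight feeding a single threshold element, the squared error is a quadratic function of the weights, which is what makes the system amenable to analysis.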

A special case of very limited generalization is provided in methods such as those used in an
early  Clark  and  Farley  model  2/  and  in  the  simple  Perceptrons.     This  has  been  termed  'contiguity' 
or  'preponderance'  generalization  ±f  and  requires  that  the  pattern  to  which  generalization  is  to  be 
applied  must  share  a  number  of  retinal  cells  in  common  with  patterns  that  have  been  previously 
classified  in  the  desired  response  category.     That  is,   the  method  relies  on  the  fact  that  minor 
displacements,    shape  variations,   or  modifications  due  to  noise  may  still  result  in  an  input  pattern 
which  shares  most  of  its  elements  with  a  prototype  pattern.     In  effect,   a  statistical  overlap  or 
border-zone  template  is  thus  provided,   to  which  an  exact  fit  is  not  required,    but  for  which  a  best- 
match  is  obtained.     In  addition  to  the  Clark  and  Farley  and  Rosenblatt  experiments  with  this  type  of 
approach, Barus j/ has investigated statistical separability criteria in which, say, the contour of a
new circle is found to approximate more closely to the family of contours of previously seen circles
than it does to those of previously identified squares.

In  the  University  of  Manchester  computer  studies  of  character  recognition  techniques,  the 
effects  of  a  limited  generalization  are  achieved  not  only  by  screening  of  suitable  property  tests, 
noise-elimination operations, and grouping of interrelated properties, but also by use of automatic
classification  principles  in  the  recognition-identification  decision  processes.     The  system  is 
described,   in  part,  as  follows: 

"The result of the scan is to produce descriptions of segments of the figure, i.e.,
divisions which are conveniently produced by the scanning process . . . The scanning
process  also  includes  measures  to  allow  for  figure  imperfections,   dirt  on  the  paper, 
and  other  forms  of  'noise'. 

"In the 'assembly' part of the programme which follows, the segments of the figure
obtained  in  the  scan  are  analysed  and  connected  wherever  appropriate,    into  true  figure 
parts:    a  description  is  given  of  the  length  and  slope  of  straight-line  parts  and  the  length  and 
curvature  of  curved  portions.     The  scan  and  assembly  sections  of  the  programme  together 
produce  the  'statement',    which  gives  a  complete  description  of  the  figure.     This  description 
is  independent  of  the  orientation  and  size  of  the  figure,   the  lengths  of  the  various  parts 
being  given  relative  to  one  another.     It  is,   in  effect,   a  coded  representation  of  the  pattern. 
It may be regarded as a one-dimensional pattern which consists of symbols chosen from a
restricted range . . .

"In the 'recognition' section which follows, a comparison is made
between the one-dimensional pattern or statement describing the unknown pattern and the
statements  already  stored  within  the  computer,   together  with  the  names  of  the  patterns 
they  describe.    A  considerable  time  would  be  wasted  if  every  stored  statement  had  to  be 
examined;  to  prevent  this,   an  automatic  classification  system  is  provided;  this  examines 
the  stored  statements  and  arranges  them  into  classes  according  to  common  features."  jy 
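
The time-saving role of the automatic classification step can be sketched as an index over stored statements. Everything specific in the sketch is an assumption made for illustration: a 'statement' is represented as a tuple of (kind, relative length) parts with kind "S" for straight and "C" for curved, and the class key is simply the number of straight and curved parts. The quoted description does not say which common features the Manchester programme actually used.

```python
from collections import defaultdict


def class_key(statement):
    """Hypothetical common feature: how many straight and curved parts occur."""
    straight = sum(1 for kind, _ in statement if kind == "S")
    curved = sum(1 for kind, _ in statement if kind == "C")
    return (straight, curved)


def build_index(stored):
    """Arrange stored (name, statement) pairs into classes by common features."""
    index = defaultdict(list)
    for name, statement in stored:
        index[class_key(statement)].append((name, statement))
    return index


def recognize(statement, index):
    """Compare the unknown statement only with stored statements of its own class."""
    for name, candidate in index.get(class_key(statement), []):
        if candidate == statement:
            return name
    return None


# Invented statements: each part is (kind, relative length).
STORED = [
    ("L", (("S", 2), ("S", 1))),
    ("O", (("C", 4),)),
    ("D", (("S", 2), ("C", 2))),
]
INDEX = build_index(STORED)
print(recognize((("C", 4),), INDEX))
```

Only the statements sharing the unknown's class key are examined, so the cost of a lookup grows with the size of one class rather than with the whole store.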

In the area of basic or long-range pattern recognition research, Farley has discussed some of
the possibilities for generalization and automatic classification more nearly akin to what may be called
'learned perception'. 1/ He considers first the simpler examples of generalization by ignoring
differences, as in border-zone or overlap situations, and by storing results of tests and
measurements on properties of large numbers of representative specimens of a particular class. Next, he
considers  rules  for  compiling  and  grouping  sets  of  properties,    such  as  properties  which  frequently 
co-occur  spatially  or  which  are  related  in  a  temporal  sequence.     To  the  property-grouping  rules 
derived  from  frequency  of  occurrence,    contiguity,   and  the  like,    may  be  added  results  of  certain  key 
decision-properties, including a 'name' that has been repetitively associated with a property-class.

If  the  various  property  classes  are  intercompared,   various  subgroups  of  properties  or  property- 
measurement  values  might  be  identified  which  are  common  to  many  input  patterns.     In  this  way,   for 
example, a new property or 'concept' might be developed, such as that of the idea of a triangle, with
three straight line-segments joined at three corners. Finally, Farley suggests that where cases
occur of source patterns having many input properties which, in combination, give a strong
correlation with some property class, the system might 'lock on' to that class even though some of
the properties or some of the property values might be in conflict with that class. "As a result",
Farley states, "A percept would have been attained which did not altogether correspond with the
real sensory stimulus." 2/

8.4    Machine Models of Perception, Recognition, and Pattern Generalization

Various  aspects  of  pattern  recognition  research  are  involved  to  a  greater  or  lesser  extent  in 
proposed  machine  models  which  simulate  perception  and  pattern  recognition,   or  generalization, 
abstraction, and pattern detection. Early suggestions developed by Pitts and McCulloch 3/ related
to a model that would provide relative invariance with respect to size. Schade has developed an
analog of the eye in which photoelectronic simulation of scanning and sensing is used, 4/ and
suggestions of Hebb 5/ have been explored with respect to simulated neural networks, especially
those which utilize random connections in self-organizing operations. Machine models capable of
demonstrating a type of 'conditioned reflex' behavior have been investigated by Uttley 6/ and more
recently by Steinbuch and associates at the Technische Hochschule, Karlsruhe. 7/

Machine  models  that  involve  some  degree  of  "learning",    by  virtue  of  continuing  adjustment 
to  new  combinations  or  new  probabilities  of  occurrence  of  specific  examples,    especially  with  regard 
to prechosen criterial features or methods involving overlap-generalization, have been mentioned. 1/
In addition to the machine models which have been previously cited, such as those of Harmon, Clark
and Farley, Loebner, the alpha-type perceptrons of Rosenblatt, etc., current progress in this area
of research that is potentially applicable to problems of automatic character recognition may be
exemplified by the work of Estavan, Roberts, Singer, Rosenblatt (with respect to a more sophisticated
type of perceptron), Barus, and Uhr-Vossler, as well as by the so-called "spatial computer" models.

The Estavan learning-machine model, a program which can be simulated on the IBM 709
computer, involves first a quantized pattern-input-matrix, scanned in sequential order from left to
right and from bottom to top. The behavior of the model is based initially on a simple S → R
learning paradigm. However, search for critical features and for interrelationships between these
features  is  provided.     Moreover  it  is  claimed  that: 

"The program may be given some hypotheses; that is, it may be given means for
choosing  responses  on  bases  other  than  habit  strengths  increased  or  decreased  by  fixed 
amounts.  For  example,  it  may  be  written  with  the  ready-made  hypothesis  that  similar 
sense  patterns  call  for  similar  behaviors.  "  jV 

In the Roberts experiments at the Lincoln Laboratory, M.I.T., models of adaptive networks
were investigated with respect to recognition of the same handprinted characters used in the
Sherman tests of the quasi-topological method for character recognition. 2/ For discrimination
between two pattern types, a network of 2,048 cells (simulated on a digital computer) was divided
into two groups. Each cell, being connected randomly to eight input bits, had a weight preassigned,
and  reinforcements  were  applied  to  weights  of  cells  in  the  correct  response  set  for  each  trial.     After 
normalization  with  respect  to  center  of  gravity  of  the  input  characters,   two  character  patterns 
could  be  distinguished  with  95%  accuracy  after  20  trials,   and  after  100  trials  the  model  was  able  to 
accommodate  rotations  and  systematic  distortions.     For  six  character-pattern  categories,   a  rein- 
forcement procedure  was  utilized  that  modified  the  random  connections  between  the  cells  as  well  as 
cell  weights,   and  that  facilitated  a  search  for  'good'    connections. 
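
A two-group network of this general kind can be sketched as follows. The sketch is illustrative only and is not a reconstruction of the Roberts program: the firing rule (a cell fires when at least half of its eight inputs are on), the alternating assignment of cells to groups, and the class and method names are assumptions.

```python
import random


class AdaptiveNet:
    """Two-group network of randomly connected cells (illustrative only)."""

    def __init__(self, n_inputs, n_cells=2048, fan_in=8, seed=0):
        rng = random.Random(seed)
        # Each cell samples `fan_in` input bits at random.
        self.conn = [[rng.randrange(n_inputs) for _ in range(fan_in)]
                     for _ in range(n_cells)]
        self.weight = [1.0] * n_cells                 # preassigned weights
        self.group = [i % 2 for i in range(n_cells)]  # response set 0 or 1

    def active(self, bits):
        # Assumption: a cell fires when at least half of its inputs are on.
        return [2 * sum(bits[j] for j in c) >= len(c) for c in self.conn]

    def respond(self, bits):
        # The response set with the larger weighted count of firing cells wins.
        totals = [0.0, 0.0]
        for fired, w, g in zip(self.active(bits), self.weight, self.group):
            if fired:
                totals[g] += w
        return 0 if totals[0] >= totals[1] else 1

    def reinforce(self, bits, label, amount=1.0):
        # Reward only the firing cells that belong to the correct response set.
        for i, fired in enumerate(self.active(bits)):
            if fired and self.group[i] == label:
                self.weight[i] += amount


if __name__ == "__main__":
    left = [1] * 8 + [0] * 8      # two toy 16-bit "patterns"
    right = [0] * 8 + [1] * 8
    net = AdaptiveNet(n_inputs=16)
    for _ in range(20):           # 20 training trials per pattern
        net.reinforce(left, 0)
        net.reinforce(right, 1)
    print(net.respond(left), net.respond(right))
```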

Further  investigations  have  resulted  in  models  in  which  connections  are  not  random,   but  in 
which  the  input  pattern  elements  to  a  single  cell  represent  a  specific  local  operation  on  a  character. 
In  these  later  models  various  reward-reinforcement  functions  have  been  studied.     Roberts  concludes 
that  it  is  possible  to  recognize  the  characters  of  a  complete  alphabet  with  an  accuracy  of  94%,   after 
a  training  period  of  40  trials  per  character,   using  these  networks  and  a  suitable  reward  function. 

The machine model experiments investigated by Singer are concerned with elaboration of the
Pitts-McCulloch method of optically centering the image by schematizing the structure of this type
of form recognition in some detail. 4/ In effect, the principle is to convert spatial aspects of
pattern configurations into temporal changes, and to compensate for differences in arrival-time for
processed patterns so as to provide size invariance. Singer summarizes this aspect of the proposed
technique as follows:

"Size  invariance  is  accomplished  by  a  group  transformation  of  the  image.     The 
transformation  utilized  here  is  a  dilation  group  transformation  which  is  best  described 
as  a  set  of  electrical  impulses,    each  representing  a  resolved  point  of  an  image  border, 
all  traveling  uniformly  outward  from  the  center  of  delay  lines  arranged  in  a  polar 
network.     By  considering  time  coincidences  of  these  pulses  at  selected  points  of  the 
polar array, recognition of an image is accomplished." 5/
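
The idea of turning a change of size into a uniform shift in time can be illustrated numerically. The sketch below is not Singer's delay-line network; it substitutes a log-polar analogue in which the "arrival time" of each border point is taken to be the logarithm of its distance from the center, so that a dilation by a factor k adds the same constant to every arrival time. The function names and the coincidence tolerance are assumptions.

```python
import math


def arrival_signature(border_points, center):
    """Return sorted (angle, relative log-radius) pairs for the border points.

    Assumption: arrival time is modelled as log(radius), so a uniform dilation
    by k adds log(k) to every arrival time. Subtracting the earliest arrival
    removes that constant and leaves a size-invariant signature.
    """
    cx, cy = center
    polar = []
    for x, y in border_points:
        r = math.hypot(x - cx, y - cy)
        if r == 0:
            continue  # the center itself carries no angular information
        polar.append((math.atan2(y - cy, x - cx), math.log(r)))
    if not polar:
        return []
    first = min(t for _, t in polar)
    return sorted((angle, t - first) for angle, t in polar)


def same_shape(sig_a, sig_b, tol=1e-9):
    """Coincidence test: equal angles and equal relative arrival times."""
    return len(sig_a) == len(sig_b) and all(
        math.isclose(a1, a2, abs_tol=tol) and math.isclose(t1, t2, abs_tol=tol)
        for (a1, t1), (a2, t2) in zip(sig_a, sig_b)
    )
```

Under these assumptions a diamond and the same diamond enlarged three times give the same signature, while a diamond of different proportions does not.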

1/ Compare, for example, the approach of Baran and Estrin: "Samples of a set of characters
are first identified by a human operator. From such inputs, a probability matrix is
computed and used to derive a set of weighted filters or stencils which distinguish each
character relative to the set of possible characters. When unknown characters are read, the
proposed pattern recognition machine produces estimates of the confidence of the
identification." (Ref. 34, p. 29.)

2/ Estavan, D. "Pattern recognition, machine learning and automated teaching," Ref. 125, p. 6.

3/ Roberts, L. G. "Pattern recognition with an adaptive network." Ref. 387.

4/ See Singer, J. R., Refs. 425-430.

5/ Singer, J. R. "Model for a size invariant pattern recognition system," Ref. 429, p. 239.

In later developments, Kelly and Singer report the incorporation of "learning" principles involving
a 'captive state' feature such that once a pattern has been 'learned' the model will cease to modify
the reference pattern but will continue to recognize. 1/

Kelly and Singer have also reviewed various examples of machine models of perception and
recognition. They have stressed the differences between adaptive-abstractive systems and systems
that may be adaptive in the sense that performance improves with increased experience on
representative samples but that do not rely on criterial features and properties or interrelationships
between pattern elements. Thus they contrast the assumptions of Von Foerster and of Greene, with
respect to preorganization, abstracting, and property-filtering as necessary to generalization, 2/
with the emphasis in the randomly connected, overlap template models such as the simple
Perceptrons.

Recent work at Cornell Aeronautical Laboratory and elsewhere, however, has suggested
improvements, modifications, and new organizational schemes to produce Perceptrons of greater
flexibility. It has been noted that the simple Perceptron has little or no capacity for generalization. 3/
The simple Perceptron is a 3-layer device, involving two levels of transformation, from sensory
cell (S-unit) to association (A-unit) and from A-unit to response (R-unit). Because of the randomness
of the S → A connections, the original geometric configuration, specifically including the
relationships of connectivity, is lost in the transformation of the input pattern into A-unit space. By
providing for both inhibitory and excitatory connections in the randomly organized net, problems
such as one pattern covering another ("E" and "F", "O" and "Q") can often be resolved. However,
most of the experience with Perceptrons of this type has been with relation to discriminations
between specific pairs of patterns only, and difficulties remain with respect to size, translation,
and rotation variance, 4/ as well as to the fundamental question of whether the kinds of patterns
that can be discriminated consist of interesting or useful classes of patterns. On the other hand, if
these kinds of patterns can be equated with significant or criterial features of patterns, then
perceptron-type systems might find application in precisely the preorganization stages of a more
complex perception-recognition-detection system which "make the sensors particularly sensitive
to important features of possible patterns." 5/
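
The three-layer arrangement and the covering problem can be made concrete with a small sketch. It is an illustration under stated assumptions, not a reproduction of any Cornell program: the wiring parameters, the unit threshold, the error-correction rule used to train the A-unit to R-unit weights, and all function names are hypothetical. With excitatory connections only, every A-unit that fires for a pattern such as "F" must also fire for any pattern that covers it, such as "E"; inhibitory connections make it possible for some A-units to fire for the covered pattern alone.

```python
import random


def make_a_units(n_s, n_a, fan_in, inhibitory=True, seed=0):
    """Random S -> A wiring: each A-unit gets `fan_in` (S-index, sign) pairs."""
    rng = random.Random(seed)
    signs = (1, -1) if inhibitory else (1,)
    return [[(rng.randrange(n_s), rng.choice(signs)) for _ in range(fan_in)]
            for _ in range(n_a)]


def a_activity(a_units, s_bits, threshold=1):
    """An A-unit fires when excitation minus inhibition reaches the threshold."""
    return [1 if sum(sign * s_bits[i] for i, sign in unit) >= threshold else 0
            for unit in a_units]


def respond(weights, activity):
    """R-unit: +1 if the weighted sum of active A-units is positive, else -1."""
    return 1 if sum(w * a for w, a in zip(weights, activity)) > 0 else -1


def train(a_units, samples, epochs=50):
    """Error-correction training of the A -> R weights; S -> A stays fixed."""
    weights = [0.0] * len(a_units)
    for _ in range(epochs):
        for s_bits, target in samples:          # target is +1 or -1
            act = a_activity(a_units, s_bits)
            if respond(weights, act) != target:
                weights = [w + target * a for w, a in zip(weights, act)]
    return weights


def covered_only(a_units, inner, outer):
    """Count A-units that fire for `inner` but not for the pattern covering it."""
    a_in, a_out = a_activity(a_units, inner), a_activity(a_units, outer)
    return sum(1 for i, o in zip(a_in, a_out) if i and not o)


# Toy 5 x 3 letters, flattened row by row; every black cell of F is black in E.
E = [1, 1, 1,  1, 0, 0,  1, 1, 1,  1, 0, 0,  1, 1, 1]
F = [1, 1, 1,  1, 0, 0,  1, 1, 1,  1, 0, 0,  1, 0, 0]
```

Calling `covered_only(make_a_units(15, 100, 4, inhibitory=False), F, E)` always gives zero, whereas the same call with `inhibitory=True` may give a positive count, depending on the random wiring.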

Moreover,    Rosenblatt  and  his  co-workers  have  continued  investigations  into  possibilities  for 
more  versatile  Perceptrons.     They  have,    for  example,    explored  an  automatic  classifying  system 
in  contrast  to  the  forced  training  techniques  used  in  the  earlier  studies.     That  is,    in  the  early 
experiments,    an  operator  forced  responses  to  the  training  samples  to  which  the  Perceptron  was 
exposed.     Results  of  the  later  work  have  been  reported,   in  part,    as  follows: 

"It  should  be  emphasized  that  no  attempt  is  made  in  the  course  of  this  experiment  to 
direct  the  system  or  to  influence  it  in  any  way  in  its  choice  of  response;  stimuli   from  the 


1/ Kelly, P. M. and J. R. Singer. "Bio-computer design. Part 1: Problems in bio-computer
design, by Peter M. Kelly. Part 2: A specific device for bio-computers, by Jay R. Singer."
Ref. 251.

2/ Greene, whom they cite, is concerned with such questions as the following: "The investigation
summarized here aims to show how a rather simple type of network could exhibit many of
the pattern-stabilizing and perceiving abilities that make human vision meaningful . . . The
model to be investigated is not intended to do the entire job of pattern recognition; rather its
purpose is to 'notice' and hold together good Gestalten, or perceptual units, in order that the
pattern recognizer would have stable and meaningful units with which to operate." (Ref. 179,
p. 2, see also Refs. 178, 180, 181.)

3/ On this point, Kelly and Singer (Ref. 251) cite investigations by Warshaw (Warshaw, M.,
"An analysis of the Perceptron", Ref. 525.)

4/ These include, for example, the number of training organizations that would have to be
developed to accommodate different 'points of view' of the same pattern (Kelly, P. M., Ref. 252,
p. 222), the difficulties in arriving at proper reinforcement schemes for size variance
(Murray, A. E., "A half-perceptron pattern filter", Ref. 319), and the number of representative
samples that must be provided for even a restricted overlap generalization. On this last point,
Rosenblatt himself notes that: "It is necessary for the perceptron to see the letter 'N' in a
large number of intermediate positions so as to form a chain or 'contiguity' sequence."
Ref. 397.

5/ Kelly, P. M. "Problems in bio-computer design," Ref. 252, p. 216.

two classes are presented in a random sequence, and the only rule of operation is that
whenever the response R-1 occurs, the A-units are 'reinforced' . . . There is an optimum
value  of  the  decay  rate  ...    If  the  decay  is  too  small,   the  system  becomes  rigidly  set  in  a 
'wrong'  pattern  of  response,   while  if  the  decay  rate  is  too  great,   the  perceptron  forgets  too 
rapidly,    and  learning  is  unstable  .  .  . 

"We  had  found  a  system  which  could  spontaneously  differentiate  dissimilar  forms, 
while  assigning  similar  forms  to  the  same  class.     It  soon  became  evident,   however, 
that a perceptron designed in this fashion is exceedingly temperamental, and will perform
properly only under a limited spectrum of environmental conditions." 1/

Subsequent  Perceptron  research  has  included  study  of  "transform  association  systems"  in 
which  the  objective  is  to  provide  generalization  over  area-preserving  transformations.     The  methods 
considered  consist  in  so-called  "cross-coupled  gamma  systems".     That  is,   the  improved  Perceptrons 
have  reinforcement  rules  whereby  gains  for  active  units  are  balanced  by  compensating  losses  for 
inactive  units,    and  both  inhibitory  and  excitatory  connections  are  permitted  between  A-units  as  well 
as  from  A-units  to  R-units.     Under  these  design  principles,    it  appears  that  features  such  as  local 
connectivity  of  source  patterns  may  be  preserved,   and  that  biases  favorable  to  detection  of  pattern 
similarities  for  a  given  one-to-one  transformation  may  be  built  up. 
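
The balancing of gains and losses in a gamma-type reinforcement rule can be sketched as follows. This is a minimal illustration, assuming that the loss is shared equally among the inactive units so that the total value of the set is conserved; the function name and the parameter `eta` are hypothetical.

```python
def gamma_reinforce(values, active, eta=1.0):
    """Add `eta` to each active unit and spread an equal total loss over the rest.

    `values` holds the current value of each A-unit and `active` holds 1 or 0
    for each unit. The sum of the returned values equals the sum of the inputs.
    """
    n_active = sum(1 for a in active if a)
    n_inactive = len(values) - n_active
    if n_active == 0 or n_inactive == 0:
        return list(values)  # nothing to balance against
    loss = eta * n_active / n_inactive
    return [v + eta if a else v - loss for v, a in zip(values, active)]
```

For example, reinforcing one active unit out of four by 3.0 raises that unit by 3.0 and lowers each of the other three by 1.0.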

Barus, as in the early Perceptrons, assumes a reductive border-zone template principle with
overlap generalization, but he also assumes a cross-coupling as in the later Perceptron systems. 2/
Moreover, in the Barus proposals the cross-coupling is directed, not only in terms of prior rewards
but also in terms of current firings. In the Barus system, the input pattern elements as such are
indeed originally disassociated from their spatial adjacency relationships, as in Baran and Estrin
and the Perceptrons, by random connections of S-unit energizations and A-unit activations, but the
observed identification formula will consist not only of A-units directly activated but also of A-units
previously linked to these active A-units by reward-reinforcements for previously associated firings.
Thus the Barus system differs from Perceptron systems in at least the following aspects:

(1) There is, in the Barus system, an imposed separability of A-cell to R-cell connections.
Although these connections are randomly organized, a particular A-cell does not
participate in the reward given for more than one response connection.

(2) The Barus system is cross-coupled from the beginning of operation, and laws are
provided such that if one A-cell fires, then those cells that have been previously
linked to it in a rewarded response have lowered thresholds for firing.

(3) In the Barus system, an initial randomness of firing is provided, such that accidental
noise, the activation of an S-unit connected to a particular A-unit, and the adjustable
weights affecting A-unit firing thresholds in accordance with previously linked firing
responses, may all contribute to a particular A-unit → R-unit reaction.

Thus, notwithstanding the incorporation of different types of randomness into the self-organizing
aspects of his perception-nets, Barus in effect provides a special form of context-dependency in
behavior that might well be termed 'context-expectancy.'
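
The threshold-lowering rule of item (2) can be sketched in a few lines. The sketch is an illustration only and omits the random firing of item (3) and the separability constraint of item (1); the class name, the numerical thresholds, and the two-pass firing scheme are assumptions rather than details of the Barus proposals.

```python
from itertools import combinations


class CrossCoupledCells:
    """A-cells whose thresholds drop when previously linked cells fire."""

    def __init__(self, n_cells, base_threshold=2.0, coupling=1.0):
        self.n = n_cells
        self.base = base_threshold
        self.coupling = coupling
        self.links = set()  # unordered pairs (i, j), i < j, rewarded together

    def reward(self, fired):
        """Link every pair of cells that fired together in a rewarded response."""
        for i, j in combinations(sorted(fired), 2):
            self.links.add((i, j))

    def fire(self, drive):
        """Pass 1: direct firing. Pass 2: linked cells fire at lowered thresholds."""
        direct = {i for i in range(self.n) if drive[i] >= self.base}
        result = set(direct)
        for i in range(self.n):
            if i in direct:
                continue
            helpers = sum(1 for j in direct
                          if (min(i, j), max(i, j)) in self.links)
            if drive[i] >= self.base - self.coupling * helpers:
                result.add(i)
        return result
```

A cell that is too weakly driven to fire on its own will fire once a cell with which it has shared a reward is active, which is one way of reading the 'context-expectancy' described above.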

Finally, we note that in the recent Uhr-Vossler models for pattern recognition of line
drawings, both 'learning' and self-organization in the sense of selection from machine-generated
random operators are displayed. This system has been reported, in part, as follows:

"We  are  now  in  the  process  of  testing  and  extending  a  pattern  recognition  program, 
for  the  IBM  709,    that  is  supposed  to  learn  to  recognize  patterns  such  as  alphanumeric 
letters  and  line  drawings  of  simple  objects.     In  this  program,   patterns  are  initially 
presented  to  a  computer  without  any  operators  or  any  output  space  about  meanings  of 
output  sets.     Operators  are  generated  by  two  methods:     (a)    by  extracting  (if  you  will, 
'imitating') fragments of the inputs-as-known to the computer via the input transducing
elements,   and  (b)  by  generating  of  random  Boolean  functions,   or  'n-tuples'  of  these 
input  element  states  .  .  .    These  operators  then  map  inputs  into  output  sets,   and  feedback 
as  to  the  appropriateness  of  the  mapping  controls  generation  and  refinement  of  subsequent 
operators. 


1/ Rosenblatt, F. "Perceptual generalization over transformation groups," Ref. 397, pp. 71, 72.

2/ Barus, C. "Machine learning and pattern recognition; a progress report on 'A study of
learning machines'", Ref. 39; and "Pattern recognition by statistical overlap", Ref. 40.

". . . At present, two psychologically plausible constraints have been placed upon the
generators of operators . . . These have been (a) generation of operators that are functions
of connected elements only, and (b) generation of these operators within the space of a
smaller matrix 1/ and translation of this matrix across the larger space.

". . . in the last machine run to date, the program began without any operators, but
merely the rule to generate five new operators by extraction from each unknown input, up
to a limit of 40." 2/

These investigators claim that the restrictions make the machine-generated operators equivalent to
neural networks that are repeated in parallel.
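
Extraction of operators from an input and their translation across the larger matrix can be sketched as follows. This is a simplified illustration and not the Uhr-Vossler program: the operator size of 3 by 3, the exact-match test, and the function names are assumptions, and constraint (a) on connected elements is not modelled. Only the figures of five operators per input and a limit of 40 come from the quotation.

```python
import random


def extract_operators(pattern, operators, size=3, per_input=5, limit=40, rng=None):
    """Append up to `per_input` sub-matrices cut from `pattern`, up to `limit`."""
    rng = rng or random.Random(0)
    rows, cols = len(pattern), len(pattern[0])
    for _ in range(per_input):
        if len(operators) >= limit:
            break
        r = rng.randrange(rows - size + 1)
        c = rng.randrange(cols - size + 1)
        operators.append(
            tuple(tuple(pattern[r + i][c:c + size]) for i in range(size)))
    return operators


def match_positions(pattern, operator):
    """Translate the small operator across the pattern; list exact-match offsets."""
    size = len(operator)
    rows, cols = len(pattern), len(pattern[0])
    hits = []
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            if all(tuple(pattern[r + i][c:c + size]) == operator[i]
                   for i in range(size)):
                hits.append((r, c))
    return hits
```

In a fuller model the match positions would be mapped into output sets, and feedback on the mapping would decide which operators are kept or replaced.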

Turning now to the machine models that emphasize the use of local operations and spatial
transformations in the recognition process, we note that those of Unger 3/ and Kamentsky 4/ also
presuppose the use of parallel processing. Holland has also indicated the area of pattern recognition
as one of the important applications of the iterative circuit machine that he has described. 5/
However, while the logical modules of the Holland machines have built-in program and path-building
as well as storage capabilities, in the earlier Unger model the logical modules are under the
direction of a master control unit, which can be programmed much as the conventional digital
computer is programmed.

In  Unger's  SPAC,   or  "Spatial  Computer",   there  is  a  rectilinear  network  of  logical  modules 
in  which  each  module  has  direct  contact  with  its  four  immediate  neighbors,   and  in  which  all  modules 
simultaneously  receive  an  identical  command  or  instruction  from  the  master  control  unit.     Programs 
have  been  written  and  tested  to  simulate  SPAC  in  the  recognition  of  handprinted  alphanumeric 
characters  and  in  the  detection  of  L-  and  A-shaped  features  in  sets  of  randomly  drawn  patterns.     For 
character  recognition,   the  spatial  transformations  in  the  Unger  technique  consist  first  in  smoothing, 
image  enhancement,   and  clean-up  operations.     These  operations  fill  in  holes  in  otherwise  black 
areas  and  small  notches  or  indentations  in  otherwise  straight  edges,   eliminate  isolated  'black' 
cells including those that create small protrusions from an otherwise straight edge, and under certain
conditions  fill  in  missing  corner  points. 
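
The manner in which every module applies the same instruction to itself and its four neighbors can be sketched for two of the clean-up operations. The sketch is a simplified illustration, not Unger's SPAC program: the rules shown (fill a white cell whose four neighbors are all black; erase a black cell with no black neighbor) are assumed stand-ins for the fuller smoothing rules, and the function names are hypothetical.

```python
def neighbors4(img, r, c):
    """Values of the four immediate neighbors; cells off the edge count as white."""
    rows, cols = len(img), len(img[0])
    return [img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))]


def parallel_step(img, rule):
    """Apply one instruction to every module at once, reading only the old image."""
    return [[rule(img[r][c], neighbors4(img, r, c))
             for c in range(len(img[0]))]
            for r in range(len(img))]


def fill_hole(cell, nbrs):
    """A white cell surrounded by four black neighbors becomes black."""
    return 1 if cell == 0 and sum(nbrs) == 4 else cell


def drop_isolated(cell, nbrs):
    """A black cell with no black neighbor becomes white."""
    return 0 if cell == 1 and sum(nbrs) == 0 else cell


def clean_up(img):
    """Two successive instructions, as a master control unit might issue them."""
    return parallel_step(parallel_step(img, fill_hole), drop_isolated)
```

Because each step builds a new image from the old one, the result does not depend on the order in which cells are visited, which is the property that the simultaneous operation of the modules provides.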

For 34 alphanumeric characters, 34 features or properties are used by Unger for
discrimination. These are primarily features that can be detected by contour-tracing (horizontal
cavity open above, vertical cavity open to the right, for example), but the list includes some
relative-position-dependent and proportion-dependent properties as well, such as "leftmost point of a
vertical cavity open to the right lies in the upper two thirds of the figure", and "height of the left leg
of a V-shaped figure less than half the height of the right leg." 6/ Although the processing operations are
carried out simultaneously and in parallel over the entire image field, the choice of 'next step',
given the outcomes for any one operation, follows a decision-tree structure.

In Kamentsky's spatial computer model, the image field is also systematically transformed by
a predetermined network of threshold-responsive elements, called 'speurons' in that they are
neuron-like with respect to multi-input, single-output characteristics and in that they are connected in
spatial arrays. The speurons may be connected so that there is excitatory or inhibitory gating
between these elements and particular points in the image field or between inputs and the output of a
given element. They operate simultaneously on all points of the image field, under control of a
program, both to reduce noise or normalize and also to extract features such as straight lines,
curves, closed loops, and corners. 7/
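
A threshold element with excitatory and inhibitory inputs, applied at every point of the image field, can be sketched as follows. The sketch is illustrative and is not drawn from Kamentsky's design: the tap layout for a horizontal-segment detector, the threshold of 3, and the names used are assumptions.

```python
def apply_element(img, taps, threshold):
    """Evaluate one threshold element at every point of the image field.

    `taps` is a list of (row offset, column offset, weight) triples, with
    weight +1 for an excitatory input and -1 for an inhibitory input.
    Points outside the field are treated as white (0).
    """
    rows, cols = len(img), len(img[0])

    def at(r, c):
        return img[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    return [[1 if sum(w * at(r + dr, c + dc) for dr, dc, w in taps) >= threshold
             else 0
             for c in range(cols)]
            for r in range(rows)]


# Hypothetical detector for the middle of a horizontal stroke: excited by the
# point and its left and right neighbors, inhibited by the points above and below.
HORIZONTAL = [(0, -1, 1), (0, 0, 1), (0, 1, 1), (-1, 0, -1), (1, 0, -1)]
```

Applied to a field containing a short horizontal bar and a vertical bar, this element responds only at the middle of the horizontal bar.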

1/ I.e., smaller than that of the entire pattern input space.

2/ Uhr, L. and C. Vossler. "Suggestions for self-adapting models of brain function," Ref. 499,
pp. 92-93, passim. These investigators have also reported even more recent promising
results at the Western Joint Computer Conference, May 1961. (Ref. 497.)

3/ Unger, S. H. "Pattern detection and recognition", Ref. 501; "A computer oriented toward
spatial problems", Ref. 500.

4/ Kamentsky, L. A. "Pattern and character recognition systems...", Ref. 244.

5/ Holland, J. H. "Iterative circuit computers", Ref. 218; and "A universal computer capable
of executing an arbitrary number of sub-programs simultaneously", Ref. 219.

6/ Unger, S. H. "Pattern detection and recognition", Ref. 501, p. 1747.

7/ Kamentsky, L. A. "Pattern and character recognition systems...", Ref. 244.

Thus we find a variety of machine models of perception, abstraction, generalization, recognition,
and  pattern  detection.     Experiments  on  these  models  provide,   as  do  the  results  of  other  pattern 
recognition  research  efforts,    suggestions  that  may  be  useful,    even  if  they  are  sometimes  contradictory, 
in  the  development  of  improved  character  recognition  systems. 

9.     CONCLUSION 

In summary, then, we find that the present state of the art in automatic character recognition
is  characterized  by  progress,   paradox,   and  promise.     In  the  past  several  years,   activity  in  systems 
design  and  development  has  increased.     This  increase  in  activity  is  found  not  only  in  the  development 
of  new  and  improved  techniques  and  the  exploitation  of  various  methods  to  extend  vocabulary  size,   but 
also  in  the  entrance  of  new  organizations  to  the  field.     Progress  is  also  marked  in  the  availability  of 
higher  recognition  rates,   better  carrier  item  handling  capabilities,   and  the  actual  installation  of 
alphanumeric page-readers for field trial.

At  least  half  a  dozen  potential  suppliers  would  presumably  consider  undertaking  the  development 
of  alphanumeric  character  recognition  equipment,   for  vocabularies  of  several  hundred  characters,   at 
recognition  rates  ranging  from  several  hundred  to  several  thousand  characters  per  second,    and  for 
either  paper  or  microfilm  carrier  items  in  sizes  including  the  full  page.     This  would  be  at  typical 
costs  of,    say,   $350,000  or  less  for  the  prototype  system  and  with  anticipated  delivery  schedules  of 
12 to 18 months. Additional organizations are optimistic about the extension of techniques
developed for small vocabularies, but in many cases they have not yet actually demonstrated
extrapolation of their techniques from the limited specially designed vocabularies to larger and more
variable  vocabulary  situations.     These  indications  of  progress,   however,   are  subject  to  the  following 
significant  limitations: 

(1)  There  is  a  direct  relationship  between  the  size  and  variability  of  the  vocabulary  and  the 
cost  and  complexity  of  the  equipment. 

(2)  There  is  an  inverse  relationship  between  the  consistency  with  which  truly  high  quality 
input  can  be  maintained  and  the  tolerance  for  rejects  that  must  be  allowed  for  in 
performance  specifications. 

Paradox  is  to  be  noted,   first,   in  the  fact  that  the  increased  activity  on  the  developmental  side 
has  not  been  accompanied  by  any  significant  increase  in  the  number  of  reader  installations  that    are 
in  productive  use.     Only  in  recent  months  have  equipment  deliveries  been  made  by  suppliers  other 
than  Farrington-IMR,   and  there  is  as  yet  insufficient  experience  with  them  to  draw  conclusions  with 
respect  to  performance. 

A  second  paradox  that  appears  to  mark  the  current  state  of  the  art  in  the  field  lies  in  an 
apparent failure on the part of potential users of automatic character reading equipment to fully
appreciate the second of the limitations noted above, that is, the relationship between quality of input
and error and reject rates. The limited data available about operational performance under conditions of
usage    in  the  field  is  almost  exclusively  restricted  to  applications  in  which  there  is  little  or  no 
administrative  control  over  imprinting  or  inscription  operations.     Thus,   in  the  systems  used  at  many 
oil  industry  service  stations,   where  the  carbon  impression  from  the  customer's  charga-plate  is 
recorded  for  subsequent  automatic  reading,    smudging  from  improper  insertion  or  registration  of 
plate  and  carbon  is  complicated  by  the  likelihood  of  greasy  fingerprints  and  other  dirt  in  handling. 
The  supposedly  'high'  reject  and  error  rates  for  such  applications  are  therefore  more  likely  to  be  a 
measure of the lack of high quality input than they are of the performance of the reader. Similarly,
Figure 1 shows only one example of many, in a relatively small sample, where the
administrative control instruction, "Do not make strike-overs," was clearly ignored.

We  have  noted  previously  that  one  of  the  apparent  paradoxes  in  the  failure  to  date  to  realize 
some  of  the  potentialities  for  automatic  character  reading  which  are  now  available  is  that  of  expecting 
unusually  high  performance  specifications  to  be  met  at  relatively  low  cost.       For  example,   minimum 
reject rates are often specified which are not normally met in various manual methods. For
accounting purposes, of course, accuracy and reliability of data are at a premium. In certain other
applications,   however,   a  relatively  high  reject  rate  can  be  accepted  as  economic,   provided  that 
errors  are  low.     For  example,   Lieske  of  the  U.S.    Post  Office  Department  has  been  quoted  as  saying: 

"We're  willing  to  accept  a  high  reject  rate  if  it  makes  greater  accuracy.     Even  if  a 
machine rejects half its input, it would still be an enormous benefit, but we can't afford
very many misroutings." 1/
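
The trade between rejects and errors in the position quoted above can be illustrated with a short calculation. The sketch assumes a reader that attaches a confidence value to each reading and rejects any reading below a chosen threshold; the function name is hypothetical, and the sample values are invented for illustration and are not measurements from any reader.

```python
def reject_and_error_rates(results, threshold):
    """Return (reject rate, error rate) as fractions of all items read.

    `results` is a list of (confidence, is_correct) pairs. Items whose
    confidence falls below `threshold` are rejected instead of being read.
    """
    accepted = [ok for conf, ok in results if conf >= threshold]
    rejects = len(results) - len(accepted)
    errors = sum(1 for ok in accepted if not ok)
    return rejects / len(results), errors / len(results)


# Invented sample: eight readings with a confidence and a correctness flag.
SAMPLE = [(0.95, True), (0.9, True), (0.8, True), (0.7, False),
          (0.6, True), (0.5, False), (0.4, False), (0.3, True)]
```

With this invented sample, accepting everything gives no rejects and three errors in eight, while a threshold of 0.75 rejects five items in eight and leaves no errors among those accepted.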


1/ Cf. "Progress being made", Ref. 363, p. 17.

A third type of paradox lies in the lack of objective evidence as to the extent and variety of
actual  requirements  for  automatic  character  reading  equipment  and  the  apparent  lack  of  appreciation 
of  the  potentialities  for  cost  and  time  saving  through  the  use  of  techniques  already  available.     It  is 
hoped  that  continuing  documentation  of  representative  examples  of  the  variety  of  possible  solutions 
offered  may  promote  a  greater  appreciation  of  potentialities,   an  incentive  to  explore  requirements  in 
specific  areas  of  possible  application,   and  a  greater  willingness  to  try,    test,   and  exploit  the 
techniques  that  are  already  available. 

Promise  for  future  progress,    therefore,    lies  first  in  the  recent  organized  interest  on  the  part 
of  potential  users  in  fact  finding  as  to  actual  requirements  and  actual  variety  and  quality  of  prospective 
input,   in  possibilities  for  arriving  at  standardization  recommendations  for  well-defined,    specific 
applications, 1/ and in possibilities for establishing realistic performance specifications. Such
promise  is  evidenced  in  the  organization  of  special  committees  and  task-force  groups  to  explore  the 
situation  on  the  part  of  the  National  Retail  Merchandising  Association,   the  Office  Equipment 
Manufacturers  Institute,   the  American  Standards  Association,   and  others. 

The other major area of promise lies in the related basic research progress with respect to
pattern recognition, machine models of perception and concept formation, and self-organizing
systems. Progress in such research fields points to more than character recognition per se, that
is, to problems of machine recognition of even more complex patterns such as those that are
significant in aerial photographs and in other complex graphic material. When we consider the
future possibilities for machine processes in machine translation, information selection and
retrieval, and other aids to the improved utilization of scientific information, however, the need for
both automatic character recognition systems and general pattern recognition techniques is obvious.


1/ Compare de Paris, J. R.: "Certainly more and more standardization of type will come about.
This fact alone will accelerate the growth and applicability of optical scanning." (Ref. 90,
p. 39.)

APPENDIX
Bibliography  on  Character  Recognition 

The  bibliographic  citations  which  follow  constitute  an  appendix  to  a  state-of-the-art  review  of 
the  field  of  automatic  character  recognition,   prepared  as  of  May  15,    1961.     Several  different 
categories of material are included in this bibliography. Surveys and summaries of progress in
the  field,   discussions  of  specific  developments,    pertinent  patents,   and  technical  papers  covering 
potentially  related  basic  research  in  pattern  recognition  are  listed  together  in  alphabetic  order  by 
first-named  author  or  by  source.     A  few  references  to  other  material  specifically  cited  in  the  text 
of  the  state-of-the-art  report  are  also  included. 

In  general,   bibliographic  references  to  work  in  the  field  covering  proprietary  information  are 
not  included  in  the  bibliographic  listing  below.     However,    some  material  that  is  available  only  on 
a  restricted  basis  is  included.     In  particular,    certain  unpublished  reports  available  through  the 
Armed  Services  Technical  Information  Agency  (ASTIA)  are  included,    with  ASTIA  numbers  shown. 
It  should  be  noted  that  ASTIA  services  are  available  only  in  support  of  military  research  or 
development projects and contracts. If a request to ASTIA for material is based upon the
requirements of such a contract, the necessary forms and related information will be sent to the inquirer
upon  receipt  of  the  request. 
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THE  NATIONAL  BUREAU  OF  STANDARDS 

The  scope  of  activities  of  the  National  Bureau  of  Standards  at  its  major  laboratories  in  Washington,  D.C.,  and 
Boulder,  Colorado,  is  suggested  in  the  following  listing  of  the  divisions  and  sections  engaged  in  technical  work. 
In  general,  each  section  carries  out  specialized  research,  development,  and  engineering  in  the  field  indicated  by 
its  title.-  A  brief  description  of  the  activities,  and  of  the  resultant  publications,  appears  on  the  inside  of  the 
front  cover. 

WASHINGTON,  D.C. 

Electricity.  Resistance  and  Reactance.  Electrochemistry.  Electrical  Instruments.  Magnetic  Measurements. 
Dielectrics. 

Metrology.  Photometry  and  Colorimetry.  Refractometry.  Photographic  Research.  Length.  Engineering  Metrology. 
Mass  and  Scale.    Volumetry  and   Densimetry. 

Heat.  Temperature  Physics.  Heat  Measurements.  Cryogenic  Physics.  Equation  of  State.  Statistical  Physics. 
Radiation  Physics.  X-ray.  Radioactivity.  Radiation  Theory.  High  Energy  Radiation.  Radiological  Equipment. 
Nucleonic  Instrumentation.    Neutron  Physics. 

Analytical  and  Inorganic  Chemistry.  Pure  Substances.  Spectrochemistry.  Solution  Chemistry.  Standard  Refer- 
ence Materials.    Applied  Analytical  Research. 

Mechanics.  Sound.  Pressure  and  Vacuum.  Fluid  Mechanics.  Engineering  Mechanics.  Rheology.  Combustion 
Controls. 

Organic  and  Fibrous  Materials.  Rubber.  Textiles.  Paper.  Leather.  Testing  and  Specifications.  Polymer  Struc- 
ture.   Plastics.    Dental  Research. 

Metallurgy.  Thermal  Metallurgy.  Chemical  Metallurgy.  Mechanical  Metallurgy.  Corrosion.  Metal  Physics.  Elec- 
trolysis and  Metal  Deposition. 

Mineral  Products.  Engineering  Ceramics.  Glass.  Refractories.  Enameled  Metals.  Crystal  Growth.  Physical 
Properties.    Constitution  and  Microstructure. 

Building  Research.  Structural  Engineering.  Fire  Research.  Mechanical  Systems.  Organic  Building  Materials. 
Codes  and  Safety  Standards.    Heat  Transfer.    Inorganic  Building  Materials. 

Applied  Mathematics.  Numerical  Analysis.  Computation.  Statistical  Engineering.  Mathematical  Physics.
Operations  Research.

Data  Processing  Systems.  Components  and  Techniques.  Digital  Circuitry.  Digital  Systems.  Analog  Systems. 
Applications  Engineering. 

Atomic  Physics.  Spectroscopy.  Infrared  Spectroscopy.  Solid  State  Physics.  Electron  Physics.  Atomic  Physics.

Instrumentation.  Engineering  Electronics.  Electron  Devices.  Electronic  Instrumentation.  Mechanical
Instruments.  Basic  Instrumentation.

Physical  Chemistry.  Thermochemistry.  Surface  Chemistry.  Organic  Chemistry.  Molecular  Spectroscopy.
Molecular  Kinetics.  Mass  Spectrometry.
BOULDER,  COLO.

Cryogenic  Engineering.  Cryogenic  Equipment.  Cryogenic  Processes.  Properties  of  Materials.  Cryogenic
Technical  Services.

Ionosphere  Research  and  Propagation.  Low  Frequency  and  Very  Low  Frequency  Research.  Ionosphere  Research. 
Prediction  Services.    Sun-Earth  Relationships.    Field  Engineering.    Radio  Warning  Services. 

Radio  Propagation  Engineering.  Data  Reduction  Instrumentation.  Radio  Noise.  Tropospheric  Measurements.
Tropospheric  Analysis.  Propagation-Terrain  Effects.  Radio-Meteorology.  Lower  Atmosphere  Physics.

Radio  Standards.  High  Frequency  Electrical  Standards.  Radio  Broadcast  Service.  Radio  and  Microwave
Materials.  Atomic  Frequency  and  Time  Interval  Standards.  Electronic  Calibration  Center.  Millimeter-Wave  Research.
Microwave  Circuit  Standards.

Radio  Systems.  High  Frequency  and  Very  High  Frequency  Research.  Modulation  Research.  Antenna  Research. 
Navigation  Systems. 

Upper  Atmosphere  and  Space  Physics.  Upper  Atmosphere  and  Plasma  Physics.  Ionosphere  and  Exosphere 
Scatter.    Airglow  and  Aurora.    Ionospheric  Radio  Astronomy. 